Abstract

We present a quantum algorithm that additively approximates the value of a tensor network
to a certain scale. When combined with existing results, this provides a complete problem for
quantum computation. The result is a simple new way of looking at quantum computation in
which unitary gates are replaced by tensors and time is replaced by the order in which the tensor
network is "swallowed". We use this result to derive new quantum algorithms that approximate
the partition function of a variety of classical statistical mechanics models, including the Potts
model.

1 Introduction

The discovery by Peter Shor in 1994 of a quantum algorithm for factoring n-digit numbers in
poly(n) steps stimulated a large amount of interest in the power of quantum computation [Sho97].
Since then, the search for quantum algorithms that provide exponential speedup over the best
known classical algorithms has yielded a number of results: algorithms for a number of group
and number theoretic problems that, like Shor's algorithm, use the quantum Fourier transform as
the essential ingredient (e.g. [Wat01, Kup03, vDHI03, Hal07]), an algorithm for an oracle graph
problem that uses the notion of a quantum random walk [CCD+03], and recently, algorithms for
approximating combinatorial and topological quantities such as the Jones polynomial and the Tutte
polynomial [FKW02, FKLW02, AJL06, AAEL07]. These last algorithms related to the Jones and
Tutte polynomials are fundamentally different from the previous algorithms. The work presented
here began as a study of the core features of these algorithms.

This work presents a simple new way of looking at quantum computation. The consequences 
are a) new quantum algorithms, b) the casting of the aforementioned Jones Polynomial and Tutte 
polynomial results in a new light, and c) a new geometric view of quantum computation that will 
hopefully lead to more new algorithms. 

The fundamental object for this new view is a tensor network, which we now briefly describe
(a precise description of tensor networks is given in Section 2). A tensor network $T(G,\mathcal{M})$ is a
graph $G$, a finite set of colors that can be used to label the edges of $G$, along with a finite array
of data $M_v \in \mathcal{M}$ assigned to each vertex $v \in V$ of the graph. This finite set of data $M_v$ is of
the following form: for each possible coloring of the edges incident to the vertex $v$, the vertex is
assigned a complex number. Thus for a given coloring $l$ of all the edges of the graph, each vertex
has an assigned value; we denote the product of these values by $c_l$. The value of a tensor network
$T(G,\mathcal{M})$ is defined to be the sum of $c_l$ over all possible labelings $l$ of $G$.

The notion of a tensor network geometrically captures many fundamental linear algebra concepts 
such as inner product, matrix multiplication, the trace of a matrix, composition of linear maps, 
and the dual space, to mention a few. Loosely speaking, it is these features that fundamentally link 
tensor networks both to quantum computation and to the various combinatorial and topological 
objects discussed in this paper. 

Here, we give a quantum algorithm that takes as input a tensor network and gives as output
an additive approximation of the value of the tensor network to a certain scale. Together with
previous results [SDV06, ALM06], this provides a complete problem for quantum computation. We
then apply this result to two classes of problems.

First we give new quantum algorithms for approximating an important quantity associated to a 
host of statistical mechanical models. Statistical mechanical models attempt to model macroscopic 
behavior of physical systems made from a very large number of microscopic systems that interact 
with each other. In this paper, we consider a broad class of models, called q-state models, an 
example of which is the well known Potts model. These models are described by a graph where 
the vertices are thought to be in one of q possible states. For a given assignment of states to 
the vertices, the energy of the model is given by a sum of local energy contributions of each edge, 
where the local energy is some function of the states of the two endpoints of the edge. The partition
function of the model is the sum, over all possible assignments of states to the vertices, of a particular
exponential function of the energy of the model for that assignment (see Section 5 for the details).
It turns out that many interesting macroscopic properties of the system can be deduced solely from
the partition function. These include the average energy of the system, its entropy, specific heat,
and more elaborate properties such as phase transitions [Cal85]. The calculation of the partition
function is therefore an important task in the theory of statistical physics. Here we apply the 
main result to give an additive approximation of the partition function of any q-state statistical 
mechanical model. 

Second, we show how a specific application of the main result is used as an essential step in the 
recent quantum algorithms for additively approximating the Jones and Tutte polynomials. 

The fact that the approximations are additive and depend on the approximation scale is by no 
means a minor point: if the approximation scale is very large, the algorithm will produce estimates 
that are trivial, or at least no better than a classical approximation. 

So what can be said about the approximation scale in the algorithms presented in this paper? In
general, of course, our result shows that there are plenty of problems (i.e., tensor networks) for which
the level of approximation is non-trivial (assuming the power of quantum computation exceeds that
of classical computation). With respect to statistical mechanical models, a recent result contained
in [VdNDR08] shows that the approximation scale of the algorithms presented here for certain
non-physical statistical mechanical models is small enough to solve a BQP-complete problem. Whether
the approximation scale is non-trivial for the statistical mechanics models with physical parameters
remains unknown, though for a certain range of parameters the scale is superior to that of any classical
algorithm known to the authors.

Neither tensor networks nor additive approximations are new to the study of quantum
computation. The fact that highly entangled quantum states, as well as quantum operations, can
be efficiently represented by tensor networks has been the backbone of many studies that simulate
quantum systems (see, for example, Refs. [Vid03, Vid04, VC04, MS05, SDV06, VdNDVB07,
Vid07, HKH+08]).

Separately, as mentioned, additive approximations and quantum computation have been linked
with the recent results that give quantum algorithms for additively approximating the Jones
polynomial of braids and the Tutte polynomial of planar graphs, as well as the complementary
results that show that for certain parameters, these approximations are complete quantum problems
[FKW02, FKLW02, AJL06, AAEL07]. Motivated by the Jones polynomial result, the computational
complexity of additive approximations has been further investigated in [BFLW05].

The view of quantum computation as an additive approximation of tensor networks provides 
a useful unifying lens through which to view these two classes of results. The above mentioned 
results related to classical simulation can be seen as showing that certain restrictions on the form 
of a tensor network allow for classical evaluation. The algorithmic results for the Jones and Tutte 
polynomial, as well as the statistical mechanical algorithms presented here, can be seen as the 
quantum approximation of specific tensor networks whose value is a quantity of interest. 

An interesting consequence of the main result presented here is that two core features of quantum
circuits, the unitarity of the gates and the notion of time (i.e. that the gates have to be applied
in a particular sequence), are replaced by more flexible features. The unitary gate is replaced by an
arbitrary linear map encoded in each tensor. The notion of time and sequential order of a circuit
is replaced by the geometry of the underlying graph of the tensor network along with a choice of
"bubbling" of the network, a concept that is explained in Sec. 3. Unlike a quantum circuit, which
is ordered in a unique way, a given tensor network has many possible bubblings.

An outline of this paper is as follows. We begin by giving precise definitions of tensor networks
in Sec. 2, and then in Sec. 3 we prove the central structural theorem: that the approximation of
tensor networks with the scale prescribed is a problem a quantum computer can perform efficiently
(Theorem 3.4). Section 4 then shows that this approximation problem is a complete problem for
quantum computation. In addition, Sec. 4 contains a discussion of the approximation error of this
result when applied to particular families of tensor networks. We then present quantum algorithms
for approximating the partition function of the statistical mechanics models in Sec. 5 (which include
the Ising, clock, and Potts models). Sec. 6 contains a brief discussion of tensor networks related to
some topological invariants. We offer a summary and discussion in Sec. 7.

2 Preliminaries 

2.1 Notation 

We fix a $q$-dimensional Hilbert space $H = \mathbb{C}^q$ and an orthonormal basis for $H$, which we denote by
$\{|0\rangle, |1\rangle, \ldots, |q-1\rangle\}$. For a finite set $S$, $|S|$ will denote the number of elements of $S$. A graph $G$
will be denoted by a pair $(V, E)$, where $V$ is the set of vertices and $E$ is the set of edges.

For any linear operator $A$ over some Hilbert space $H$, the norm of $A$ shall mean the operator
norm of $A$ and be denoted by $\|A\|$. The term poly(t) shall be used to denote some unspecified
polynomial function of $t$.

2.2 Tensor Networks 

Tensors are mathematical objects that appear in many branches of mathematics and physics. They
can be defined in many ways; for our purposes, we define them as follows:

Definition 2.1 (A Tensor) A tensor $M$ of rank $k$ and dimension $q$ is an array of $q^k$ numbers
that are denoted by $M_{i_1,\ldots,i_k}$, with $i_s$, $1 \le s \le k$, being indices that take on the values $0 \le i_s \le q-1$.

In the rest of the paper, we will always assume that all of the tensors we deal with are of complex 
numbers and are tensors of a fixed dimension q. 

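We now describe a couple of useful operations on tensors. Given a rank-$k$ tensor $A$ and a rank-$\ell$
tensor $B$, the product $A \otimes B$ shall be the rank-$(k+\ell)$ tensor that is just the tensor product of the
two tensors:

$$(A \otimes B)_{i_1 \ldots i_k j_1 \ldots j_\ell} = A_{i_1 \ldots i_k} B_{j_1 \ldots j_\ell} .$$

For a rank-$k$ tensor $A$ and two indices $\ell, m$ with $1 \le \ell < m \le k$, the contraction of $A$ with respect
to $\ell$ and $m$ shall be the rank-$(k-2)$ tensor $C$ given by the following equation:

$$C_{i_1 \ldots i_{\ell-1} i_{\ell+1} \ldots i_{m-1} i_{m+1} \ldots i_k} = \sum_{s=0}^{q-1} A_{i_1 \ldots i_{\ell-1}\, s\, i_{\ell+1} \ldots i_{m-1}\, s\, i_{m+1} \ldots i_k} .$$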



Combining these two operations together, we can talk about the contraction of two tensors, which
is the result of taking their product and then contracting the resulting tensor. For example, the
tensor $C_{i,j} = \sum_{k,\ell} A_{i,k,\ell} B_{j,k,\ell}$ is the contraction of the two rank-3 tensors $A_{i,k,\ell}$ and $B_{j,k,\ell}$ along the
$k, \ell$ indices.

Remark 2.1.1 Contraction can be thought of as a generalization of the notions of inner product
and matrix multiplication. The contraction of two rank-1 tensors can be seen as an inner product
between two vectors. The matrix product formula $C_{i,j} = \sum_k A_{i,k} B_{k,j}$ can be viewed as the
contraction of the product of two rank-2 tensors $A$ and $B$ with respect to the second index of $A$ and the
first index of $B$.
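The product and contraction operations can be checked numerically. The following is a minimal NumPy sketch, not taken from the paper; the dimension `q = 3`, the random tensors, and all variable names are illustrative assumptions. Note that contraction as defined here is bilinear, so the "inner product" of rank-1 tensors involves no complex conjugation.

```python
import numpy as np

rng = np.random.default_rng(0)
q = 3  # common dimension of every index (illustrative choice)

A = rng.normal(size=(q, q, q))  # rank-3 tensor A_{i,k,l}
B = rng.normal(size=(q, q, q))  # rank-3 tensor B_{j,m,n}

# Product: the rank-6 tensor (A ⊗ B)_{i,k,l,j,m,n} = A_{i,k,l} B_{j,m,n}.
P = np.einsum("ikl,jmn->ikljmn", A, B)

# Contract the product twice (k with m, l with n) to obtain C_{i,j}.
C_from_product = np.einsum("ikljkl->ij", P)

# The same contraction written directly on the two tensors.
C_direct = np.einsum("ikl,jkl->ij", A, B)

# Special cases from the remark: inner product and matrix multiplication.
u, v = rng.normal(size=q), rng.normal(size=q)
X, Y = rng.normal(size=(q, q)), rng.normal(size=(q, q))
inner = np.einsum("i,i->", u, v)       # two rank-1 tensors
matmul = np.einsum("ik,kj->ij", X, Y)  # second index of X with first index of Y
```

Both routes to $C_{i,j}$ agree, which illustrates that a contraction of two tensors is the product followed by contractions of the resulting tensor.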

In general, we will be interested in the contraction of the product of many tensors over multiple 
indices. It is an important observation (that can be easily checked) that the order of products and 
contractions does not matter as long as we keep proper track of the appropriate indices. 

This observation leads us to the central object of this paper, the tensor network. This is an 
extremely useful graphical picture of tensors, products, and contractions that we now describe. 
A rank-$k$ tensor $A$ shall be represented as a vertex with $k$ edges incident to it; each edge shall
correspond to one index of $A$. The product of two tensors will be represented as the disjoint union
of two such pictures, and the contraction of a tensor along the $\ell$ and $m$ indices shall be represented
by joining the edge corresponding to the $\ell$'th index with the edge corresponding to the $m$'th. With
this description, a series of products and contractions of tensors becomes a graph with labeled
vertices and a certain number of free edges. The number of free edges is exactly the rank of the
tensor that results from the products and contractions. Examples of such diagrams are given in
Fig. 1.

We shall be particularly interested in cases where all indices are contracted to yield a single 
number, or, equivalently, when the associated graph has no free edges: 

Definition 2.2 (Tensor network) A tensor network is a product of tensors that are contracted
together such that no free indices are left.^1 It is denoted by $T(G,\mathcal{M})$, with $G = (V,E)$ being a graph
and $\mathcal{M} = \{M_v \mid v \in V\}$ a set of tensors. For each $v \in V$, the rank of the tensor $M_v$ is equal to
the degree of $v$, with every index of $M_v$ being associated with an adjacent edge of $v$. Finally, each
edge denotes a contraction of the two indices that correspond to its ends. The value of the tensor
network is the number that results from the series of products and contractions described by the
network. When the context is clear we shall use $T(G,\mathcal{M})$ to denote the value of the network as
well as the network itself.

^1 We note that in other works in quantum information, it is conventional to include also graphs with free edges in
the definition of tensor networks. Nevertheless, here we narrow the definition to fully contracted graphs.

Figure 1: A graphical representation of tensors: Fig. (a) denotes a rank-3 tensor $A_{i_1,i_2,i_3}$, Fig. (b)
denotes the rank-4 tensor $B_{j_1,j_2,j_3,j_4}$, and Fig. (c) denotes their contraction $\sum_k A_{i_1,i_2,k} B_{k,j_2,j_3,j_4}$. In
the following, we will usually omit the labeling of tensors' indices.

Figure 2: An example of a simple tensor network $T(G,\mathcal{M})$.

With these definitions, a tensor network is nicely described pictorially by a graph, as demonstrated
in Fig. 2.

The definition of a tensor network motivates a different notation that will be especially helpful
when studying statistical models. Given a tensor network $T(G,\mathcal{M})$, we define an edge labeling $l$ to
be an assignment of an integer $0, 1, \ldots, q-1$ to each edge of $G$. The network tensors can be viewed
as functions of these labelings: $M_v(l) \stackrel{\mathrm{def}}{=} (M_v)_{i_1,\ldots,i_k}$, where the values of the indices $(i_1, i_2, \ldots, i_k)$
are defined by the labeling $l$. With this notation, the value of a tensor network can be neatly
written as a sum over all possible labelings of the edges:

$$T(G,\mathcal{M}) = \sum_{\text{labeling } l} \; \prod_{v \in V} M_v(l) . \qquad (1)$$
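Eq. (1) can be evaluated directly by brute force for small networks. The sketch below is an illustration only: the function `network_value` and its input format (each vertex given as a tensor together with the ids of its incident edges, one per tensor axis) are assumptions made here, not notation from the paper. The cost is exponential in the number of edges, so this is a reference implementation rather than a practical method.

```python
import itertools
import numpy as np

def network_value(q, num_edges, vertices):
    """Evaluate Eq. (1): sum over edge labelings of the product of vertex entries.

    vertices: list of (tensor, edge_ids) pairs; axis s of `tensor` is attached
    to the edge with id edge_ids[s].
    """
    total = 0.0
    for labeling in itertools.product(range(q), repeat=num_edges):
        term = 1.0
        for tensor, edge_ids in vertices:
            # M_v(l): the entry of M_v selected by the labels of its edges.
            term *= tensor[tuple(labeling[e] for e in edge_ids)]
        total += term
    return total

# Example: a triangle with edges e0 = (v0, v1), e1 = (v1, v2), e2 = (v2, v0).
rng = np.random.default_rng(3)
A, B, C = (rng.normal(size=(2, 2)) for _ in range(3))
triangle = [(A, (2, 0)), (B, (0, 1)), (C, (1, 2))]
value = network_value(2, 3, triangle)
```

For this triangle the sum over labelings is $\sum_{a,b,c} A_{c,a} B_{a,b} C_{b,c}$, i.e. the trace of the matrix product $ABC$.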

2.3 Tensors as quantum states and quantum operators 

There is an extremely useful relation between tensors and quantum states and quantum operators.
Given a Hilbert space $H^{\otimes k}$ (recall that $H = \mathbb{C}^q$) with a fixed basis $\{|i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_k\rangle\}$, there is
a 1-1 mapping between rank-$k$ tensors of dimension $q$ and vectors in the Hilbert space, given by:

$$M \mapsto |M\rangle \stackrel{\mathrm{def}}{=} \sum_{i_1,\ldots,i_k=0}^{q-1} M_{i_1,\ldots,i_k}\, |i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_k\rangle .$$

Figure 3: A graphical illustration of the action of the swallowing operator $M^{K,L}$. It takes a $k$-rank
tensor into an $\ell$-rank tensor by contracting its $k$ indices with its $K$ indices. The resultant contraction
has $\ell$ indices: the indices that come from the $L$ indices of $M^{K,L}$.



In addition, we can also identify tensors with linear maps from one Hilbert space to another.
Given a rank-$n$ tensor $M$, we partition the indices of $M$ into two sets $K$ and $L$. Set $k = |K|$,
$\ell = |L|$. Then define $M^{K,L} : H^{\otimes k} \to H^{\otimes \ell}$ to be the map:

$$M^{K,L} \stackrel{\mathrm{def}}{=} \sum M_{i_1 \ldots i_\ell j_1 \ldots j_k}\, |i_1\rangle \otimes \cdots \otimes |i_\ell\rangle \langle j_1| \otimes \cdots \otimes \langle j_k| , \qquad (2)$$

where in the above sum the $i$ variables range over all possible values of the indices in $L$ and the $j$
variables range over all possible values of the indices in $K$. We further note that although we wrote
$M_{i_1 \ldots i_\ell j_1 \ldots j_k}$, this will only be correct if the set $L$ consists of the first $\ell$ indices of $M$ and $K$ of the
last $k$ indices. In general the $i$ and $j$ indices will be shuffled around to correspond to the locations
of $K$ and $L$. Alternatively, we can define $M^{K,L}$ inside the pictorial world of tensor networks: it is
the linear map that takes rank-$k$ tensors to rank-$\ell$ tensors. Given a rank-$k$ tensor $A$, contract the
$k$ indices of $A$ with the indices $K$ of $M$, and the result is a rank-$\ell$ tensor with the indices $L$. This
is demonstrated in Fig. 3.
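The two descriptions of $M^{K,L}$ (as the operator of Eq. (2) and as a contraction) can be compared numerically. In the sketch below, `tensor_to_map` is a hypothetical helper introduced for illustration; it assumes `K` and `L` are given as lists of axis positions of the array `M`, and that vectors are flattened in row-major order.

```python
import numpy as np

def tensor_to_map(M, K, L):
    """Matrix of M^{K,L}: rows indexed by the L axes, columns by the K axes."""
    q = M.shape[0]
    assert sorted(K + L) == list(range(M.ndim)), "K and L must partition the axes"
    # Bring the output (L) axes to the front and the input (K) axes to the back,
    # then flatten each group into a single matrix index.
    return np.transpose(M, L + K).reshape(q ** len(L), q ** len(K))

rng = np.random.default_rng(4)
q = 2
M = rng.normal(size=(q, q, q))   # rank-3 tensor with axes (0, 1, 2)
A = rng.normal(size=(q, q))      # rank-2 tensor to be "swallowed"

K, L = [0, 2], [1]               # inputs are axes 0 and 2, output is axis 1
as_operator = tensor_to_map(M, K, L) @ A.reshape(-1)
as_contraction = np.einsum("aib,ab->i", M, A)
```

Here the indices of `A` are matched with the axes in `K` in the order listed, which is the "shuffling" of indices mentioned in the text.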

2.4 Additive approximations 

The main result of this paper shows that every (finite) tensor network admits an efficient quantum 
additive approximation. In this section we define this type of approximation. 

Roughly speaking, an additive approximation algorithm for a quantity $X$ provides an approximation
within the range $[X - \Delta/\mathrm{poly}(n),\, X + \Delta/\mathrm{poly}(n)]$, with $\Delta$ being the approximation scale and
$n$ the running time of the algorithm. The approximation allows errors up to $\Delta/\mathrm{poly}(n)$, whereas
a multiplicative approximation only allows errors up to $|X|/\mathrm{poly}(n)$. Since $\Delta$ can be arbitrarily
larger than $|X|$, we have a weaker notion of approximation. Nevertheless, it appears most suitable
in describing the performance of many quantum algorithms, in particular those which deal with
topological invariants such as the Jones polynomial [FLW02, FKW02, FKLW02, BFLW05, AJL06].

In [BFLW05] this type of approximation and its relation to quantum computation were studied.
We therefore adopt their definition of this approximation with some minor adjustments.

Definition 2.3 (Additive approximation) A function $f : \{0,1\}^* \to \mathbb{C}$ has an additive
approximation with an approximation scale $\Delta : \{0,1\}^* \to \mathbb{R}_+$ if there exists a probabilistic algorithm that,
given any instance $x \in \{0,1\}^*$ and $\epsilon > 0$, produces a complex number $V(x)$ such that

$$\Pr\big(|V(x) - f(x)| \ge \epsilon \Delta(x)\big) \le \frac{1}{4} , \qquad (3)$$

in a running time that is polynomial in $|x|$ and $\epsilon^{-1}$.

Notice that we did not specify the type of approximation algorithm; it can be either classical or, as in
this paper, quantum. Note also that the 1/4 parameter in the definition can be replaced by any constant
$\delta \in (0, 1/2)$, since we could reduce this error probability in polynomial time by taking several runs
of the algorithm. Finally, notice that by setting $\Delta(x) \stackrel{\mathrm{def}}{=} |f(x)|$, we recover the definition of an
FPRAS (Fully Polynomial Randomized Approximation Scheme).
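The text only says that the error probability can be reduced by "taking several runs". One standard way to do this, assumed here for illustration, is to output the median of independent runs. The sketch below treats a real-valued estimator; for a complex output one could apply the same step to the real and imaginary parts separately, at the cost of a constant factor in the error. The toy estimator `noisy_run` and all parameter values are hypothetical.

```python
import numpy as np

def amplified_estimate(single_run, repetitions):
    """Median of independent runs of a real-valued randomized estimator."""
    return float(np.median([single_run() for _ in range(repetitions)]))

# Toy estimator (hypothetical): within eps*scale of the target with
# probability 3/4, and far off otherwise.
rng = np.random.default_rng(1)
target, scale, eps = 2.0, 1.0, 0.1

def noisy_run():
    if rng.random() < 0.25:
        return target + 100.0 * scale               # a failed run
    return target + scale * rng.uniform(-eps, eps)  # a successful run

estimate = amplified_estimate(noisy_run, repetitions=101)
```

The median lands within the allowed error unless at least half of the runs fail, an event whose probability decays exponentially in the number of repetitions.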

As noted in the introduction, additive approximation can be trivial if the approximation scale
$\Delta$ is too large. In the quantum case, the approximation scale might be non-trivial, yet it might be
classically reproducible. We discuss this problem in Sec. 4.

3 Approximating a tensor network with a quantum computer 

3.1 Outline of the algorithm 

We now describe the quantum algorithm that gives an additive approximation of a tensor network. 
We begin with an informal pictorial description of the algorithm. Given a tensor network, we
imagine its graph embedded in $\mathbb{R}^3$, and a large bubble that approaches the graph and starts
"swallowing" it one vertex at a time, as illustrated in Fig. 4. Every time it swallows a vertex, it also
swallows some of its adjacent edges, while the rest of the adjacent edges are only "half swallowed"; 
they intersect with the boundary of the bubble. Thus between swallows, what remains are the 
vertices that we have not yet swallowed, the edges between those vertices, as well as half-edges that 
join swallowed vertices with un-swallowed ones. In the end, once all vertices have been swallowed, 
we are left with a rank-zero tensor that is simply a number - the value of the tensor-network. 

Our algorithm will mimic this swallowing step by step by creating a state related to the tensor 
of the swallowed part of the graph. The act of swallowing a new vertex will be mirrored by an 
application of a "swallowing" operator. The swallowing of a graph can therefore be viewed as
a generalization of a standard quantum computation, with the vertices being the gates and the
bubbling determining the order in which these gates are applied.

There is one obstacle, however, that one has to cross: the swallowing operators are linear, but 
not necessarily unitary. In order to implement them on a quantum computer, we will use a standard 
trick: we will simulate the non-unitary operator as a sub-unitary operator. This is done by adding 
an ancilla qubit to the system, followed by a global unitary operation, and then a projection on a 
specific subspace of the ancilla. The resultant state will be a scaled version of the vector we would 
have obtained by the non-unitary transformation, with the scaling factor being the norm of the 
non-unitary operator. As we shall see, this will result in an approximation scale that is the product 
of the norms of the swallowing operators. 

3.2 Bubbling and swallowing operators 

We begin with formal definitions of a bubbling and the resultant swallowing operators discussed 
above. 

Definition 3.1 (Bubbling of a graph) A bubbling $B$ of a graph $G = (V,E)$ shall mean an
ordering of all the vertices of $G$,

$$v_1, v_2, v_3, \ldots \qquad (4)$$

This ordering induces a sequence of subsets

$$\emptyset = S_0 \subset S_1 \subset S_2 \subset \cdots \subset S_{|V|} = V , \qquad (5)$$

with $S_i = \{v_1, \ldots, v_i\}$. For each $i$, we define $Z_i \subseteq E$ to be the set of edges with exactly one endpoint
in $S_i$. A graphical illustration of this process is shown in Fig. 4 and Fig. 5.

Figure 4: An illustration of a bubbling of a tensor network with 8 vertices. In the beginning, the
bubble is away from the graph, but then it starts swallowing its vertices one by one. The process
ends when all the vertices are inside the bubble.

Figure 5: The first 4 stages of a bubbling of a graph (panels labeled $S_0$, $S_1$, $S_2$, $S_3$).

Given a tensor network $T(G,\mathcal{M})$, a bubbling $B$ of $G$ also defines a sequence $A_i$, $0 \le i \le n$ (with $n = |V|$), of
$n+1$ tensors as follows. For every $i$, cut the tensor network at the edges in $Z_i$; this divides the
network into two pieces: one piece contains all the vertices of $S_i$, the other contains the remaining
vertices. Define $A_i$ to be the rank-$|Z_i|$ tensor represented by the first piece of the dissected graph
(this corresponds to the tensor of the swallowed part of the graph in our informal description).
For the special $i = 0$ case, $A_0$ has no indices; it is a single number, which we define to be 1. The
last tensor, $A_n$, is also a zero-rank tensor: it corresponds to the contraction of the entire network,
hence its value is $T(G,\mathcal{M})$, the number we are trying to approximate.

The relationship between $A_i$ and $A_{i+1}$ is clear: $A_{i+1}$ is obtained from $A_i$ by contracting it
with $M_{v_{i+1}}$ over the indices in $Z_i$. This action is familiar: it is just the application of the map
$M_{v_{i+1}}^{K,L}$, where $K$ consists of the edges of $v_{i+1}$ that lie in $Z_i$, contracted with the corresponding edges of $A_i$ in $Z_i$. This leads us to the following
definition:

Definition 3.2 (The swallowing operator) Let $T(G,\mathcal{M})$ be a tensor network with a bubbling
$B = (v_1, v_2, \ldots, v_n)$. For every integer $i \in \{1, \ldots, n\}$ define:

• $K$ to be the set of input edges: all edges in $Z_{i-1}$ that are connected to $v_i$. These edges
connect $v_i$ to $S_{i-1}$.

• $L$ to be the set of output edges: all edges in $Z_i$ that are connected to $v_i$. These edges connect
$v_i$ to $V \setminus S_i$.

• $J$ to be the set of untouched edges: edges in $Z_{i-1}$ that are not adjacent to $v_i$ (these edges
must also be in $Z_i$).

We define the swallowing operator $O_{v_i}$ to be a linear operator that takes states from $H^{\otimes |Z_{i-1}|}$
to $H^{\otimes |Z_i|}$ by

$$O_{v_i} \stackrel{\mathrm{def}}{=} \mathbf{1}_J \otimes M_{v_i}^{K,L} , \qquad (6)$$

where by $\mathbf{1}_J$ we mean the identity operator on the indices corresponding to the untouched edges of
$J$.

With this definition, it is clear that

$$|A_{i+1}\rangle = O_{v_{i+1}} |A_i\rangle , \qquad (7)$$

and we are ready to state the formal approximation algorithm in the next section.
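The edge sets of Definitions 3.1 and 3.2 are purely combinatorial and can be computed from the graph and the vertex ordering alone. The following sketch is an illustration under assumed conventions: `bubbling_sets` is a hypothetical helper, edges are identified by their position in the list `edges`, and self-loops (which never cross the bubble) are not treated specially.

```python
def bubbling_sets(edges, order):
    """For each swallowed vertex v_i return Z_i and the edge sets K, L, J.

    edges: list of (u, v) vertex pairs; order: the bubbling v_1, ..., v_n.
    """
    swallowed = set()
    z_prev = set()          # Z_0 is empty: nothing crosses the bubble yet
    steps = []
    for v in order:
        swallowed.add(v)
        # Z_i: edges with exactly one endpoint inside the bubble.
        z = {e for e, (a, b) in enumerate(edges)
             if (a in swallowed) != (b in swallowed)}
        k = {e for e in z_prev if v in edges[e]}   # input edges of v_i
        l = {e for e in z if v in edges[e]}        # output edges of v_i
        j = z_prev - k                             # untouched edges
        steps.append({"vertex": v, "Z": z, "K": k, "L": l, "J": j})
        z_prev = z
    return steps

# Example: a 4-cycle swallowed in cyclic order.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
steps = bubbling_sets(cycle_edges, order=[0, 1, 2, 3])
bubble_width = max(len(step["Z"]) for step in steps)
```

In this example the first vertex has only output edges, the last has only input edges, and at most two edges ever cross the bubble.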



3.3 Implementation on a quantum computer 

As mentioned in the beginning of the section, the main technical difficulty in implementing the 
algorithm on a quantum computer is the fact that the swallowing operators might be non-unitary. 
The following lemma serves as the central building block of the algorithm. It shows the well-known 
result that any such operator can be implemented on a quantum computer using ancilla qubits, 
unitary operators and a final projection. 

Lemma 3.3 Given a linear map $A : H^{\otimes k} \to H^{\otimes k}$, let $B$ denote the space corresponding to a qubit
with computational basis $\{|0\rangle, |1\rangle\}$. Then there exists a unitary operator $U : H^{\otimes k} \otimes B \to H^{\otimes k} \otimes B$
such that

$$U\big(|\alpha\rangle \otimes |0\rangle\big) = \frac{1}{\|A\|}\big(A|\alpha\rangle\big) \otimes |0\rangle + |\beta_2\rangle \otimes |1\rangle . \qquad (8)$$

Furthermore, $U$ can be implemented on a quantum computer in time $\mathrm{poly}(q^k)$ with exponential
accuracy (where $q$ is the dimension of $H$).

Proof: Set $m = q^k$. Using the fact that every linear operator has a singular value decomposition,
we can write $A/\|A\| = V_1 D V_2$, where $V_1$ and $V_2$ are unitaries and $D$ is a diagonal matrix with diagonal
entries

$$1 \ge r_1 \ge r_2 \ge \ldots \ge r_m \ge 0 . \qquad (9)$$

Now define the map $U_D : H^{\otimes k} \otimes B \to H^{\otimes k} \otimes B$ as follows: every vector in $H^{\otimes k} \otimes B$ can be written
as a unique superposition

$$|\alpha\rangle = |\beta_0\rangle \otimes |0\rangle + |\beta_1\rangle \otimes |1\rangle . \qquad (10)$$

Then the action of $U_D$ on $|\alpha\rangle$ is defined by

$$U_D|\alpha\rangle = \big(D|\beta_0\rangle + \sqrt{1-D^2}\,|\beta_1\rangle\big) \otimes |0\rangle + \big(-\sqrt{1-D^2}\,|\beta_0\rangle + D|\beta_1\rangle\big) \otimes |1\rangle , \qquad (11)$$

where $\sqrt{1-D^2}$ is the diagonal matrix with $i$'th entry $\sqrt{1-r_i^2}$. It is a simple calculation to verify
that $U_D$ is unitary. Setting $U \stackrel{\mathrm{def}}{=} (V_1 \otimes \mathbf{1}_B)\, U_D\, (V_2 \otimes \mathbf{1}_B)$, Eq. (8) follows.

It remains to show that the operation of computing and applying $U$ can be done in quantum
poly(m) time (recall $m = q^k$). The computation of the singular value decomposition can be done in
classical poly(m) time. We are left to implement the three unitaries $V_1 \otimes \mathbf{1}$, $U_D$ and $V_2 \otimes \mathbf{1}$. Since
these operators act on a $2m$-dimensional space, they are (log m)-local qubit operators.
The simulation of unitary operators that act on $\log m$ qubits is a standard procedure in quantum
computation based on the Solovay-Kitaev theorem and can be done in poly(m) quantum time
[KSV02, DN05], and thus the whole process can be completed in poly(m) time. □
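The construction in the proof can be checked numerically on small matrices. The sketch below is a classical illustration, not a quantum implementation: `unitary_dilation` is a name chosen here, and for convenient block notation the ancilla qubit is stored as the leading tensor factor rather than the trailing one used in the text, which only permutes the basis.

```python
import numpy as np

def unitary_dilation(A):
    """Unitary U on (ancilla ⊗ system) whose |0><0| ancilla block is A / ||A||."""
    # Singular value decomposition: A / ||A|| = V1 @ diag(r) @ V2, with 1 >= r_i >= 0.
    V1, r, V2 = np.linalg.svd(A / np.linalg.norm(A, 2))
    r = np.clip(r, 0.0, 1.0)  # guard against rounding slightly above 1
    D = np.diag(r)
    S = np.diag(np.sqrt(1.0 - r ** 2))
    # U_D in block form, Eq. (11): rows/columns are grouped by the ancilla value.
    U_D = np.block([[D, S], [-S, D]])
    I2 = np.eye(2)
    return np.kron(I2, V1) @ U_D @ np.kron(I2, V2)

rng = np.random.default_rng(2)
A_test = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U = unitary_dilation(A_test)
```

The top-left block of `U` is the action on states whose ancilla starts and ends in $|0\rangle$, which is the first term on the right-hand side of Eq. (8).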

With this lemma in hand, we are ready to prove the central result of this paper: 

Theorem 3.4 (Additive quantum approximation of a tensor network) Let $G = (V, E)$ be
a graph of maximal degree $d$, and let $T(G,\mathcal{M})$ be a tensor network of dimension $q$ defined on $G$.
For a given bubbling $B = (v_1, v_2, v_3, \ldots)$ of $G$, let $\{O_{v_i}\}_{i=1}^{|V|}$ be the corresponding swallowing operators
from Def. 3.2. Then for any error parameter $\epsilon > 0$, there exists a quantum algorithm that runs in
$|V| \cdot \epsilon^{-2} \cdot \mathrm{poly}(q^d)$ quantum time and outputs a complex number $r$, such that

$$\Pr\big(|T(G,\mathcal{M}) - r| \ge \epsilon \Delta\big) \le \frac{1}{4} , \qquad (12)$$

with

$$\Delta \stackrel{\mathrm{def}}{=} \prod_{i=1}^{|V|} \|O_{v_i}\| . \qquad (13)$$

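Proof: We set $n = |V|$. As noted at the end of Sec. 3.2, given a tensor network and a bubbling
$B = (v_1, \ldots, v_n)$, the process of swallowing defines a series of vectors that live in possibly different
Hilbert spaces. We start with a normalized vector $|\Omega\rangle$ that lives in a one-dimensional Hilbert space.
Then

$$|A_1\rangle = O_{v_1}|\Omega\rangle ,$$
$$|A_2\rangle = O_{v_2}|A_1\rangle = O_{v_2} O_{v_1}|\Omega\rangle ,$$
$$\vdots \qquad (14)$$
$$|A_n\rangle = O_{v_n} \cdots O_{v_1}|\Omega\rangle = T(G,\mathcal{M})\,|\Omega\rangle .$$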

The above states, of course, cannot be directly stored in a quantum computer because they are
not necessarily normalized. We solve this problem by moving to a larger Hilbert space by adding
ancilla qubits. Specifically, at the $i$'th step, the state of the swallowing process will be stored in a
normalized state $|\psi_i\rangle$ that lives in the Hilbert space $H^{\otimes |Z_i|} \otimes B^{\otimes i}$, where $H^{\otimes |Z_i|}$ is the space where $|A_i\rangle$
lives (see Sec. 3.2), and $B^{\otimes i}$ is the space of $i$ ancilla qubits. Any vector in that space can be
uniquely expanded in the standard basis of the ancilla qubits:

$$|\phi\rangle = \sum_s |v_s\rangle \otimes |s\rangle . \qquad (15)$$

Here $|v_s\rangle \in H^{\otimes |Z_i|}$ and $|s\rangle$ is a standard basis element of the ancilla subspace that corresponds to
the classical string $s$ of $i$ bits. In terms of this expansion, the state $|\psi_i\rangle$, which encapsulates the
state of the swallowing process, will be given by

$$|\psi_i\rangle = \frac{1}{\|O_{v_1}\| \cdots \|O_{v_i}\|}\, |A_i\rangle \otimes \underbrace{|0\rangle \otimes \cdots \otimes |0\rangle}_{i \text{ ancilla qubits}} + \ldots \qquad (16)$$

In other words, projecting the ancilla qubits of $|\psi_i\rangle$ on $|0\rangle \otimes \cdots \otimes |0\rangle$ will result in the state

$$\frac{|A_i\rangle}{\|O_{v_1}\| \cdots \|O_{v_i}\|} .$$

Notice that by Eqs. (14), the norm of this vector is necessarily smaller than or equal to 1.

Let us now show that we can efficiently generate the states in Eq. (16) using a quantum
computer. The proof is by induction. We start by generating $|\psi_1\rangle$. Denote by $k$ the degree of
the vertex $v_1$. Since this is the first vertex to be swallowed, $O_{v_1}$ has no input edges and exactly
$k$ output edges. Its domain is therefore a trivial one-dimensional space, which is spanned by a
normalized vector $|\Omega\rangle$, and its co-domain is $H^{\otimes k}$.

Then we define the operator $\tilde{O}_{v_1} : H^{\otimes k} \to H^{\otimes k}$ by

$$\tilde{O}_{v_1}\big(|0\rangle \otimes \cdots \otimes |0\rangle\big) \stackrel{\mathrm{def}}{=} O_{v_1}|\Omega\rangle = |A_1\rangle \qquad (17)$$

and $\tilde{O}_{v_1}|\alpha\rangle = 0$ for every $|\alpha\rangle \in H^{\otimes k}$ that is orthogonal to $|0\rangle \otimes \cdots \otimes |0\rangle$. Notice that $\|\tilde{O}_{v_1}\| = \|O_{v_1}\|$.
We initialize one extra ancilla qubit to $|0\rangle$ and apply Lemma 3.3 to the all-zero basis state to get

$$|\psi_1\rangle = \frac{1}{\|O_{v_1}\|}\, |A_1\rangle \otimes |0\rangle + \ldots \qquad (18)$$

Here, as in Eq. (16), the three dots stand for the rest of the terms that one obtains when decomposing
$|\psi_1\rangle$ according to the standard basis of the ancilla qubit (in other words, here it would be some
vector $|\phi\rangle \otimes |1\rangle$). Notice also that the whole process can be done in $\mathrm{poly}(q^k) \le \mathrm{poly}(q^d)$ quantum
time. This proves the $i = 1$ case.

Assume now that we have "swallowed" $i-1$ vertices, and have created the state $|\psi_{i-1}\rangle$. By
Def. 3.2, $O_{v_i} = \mathbf{1}_J \otimes M^{K,L}$, with $K$ corresponding to the set of input edges, $L$ to the set of output
edges, and $J$ to the set of untouched edges that are not connected to $v_i$. We can therefore ignore all
the registers that correspond to the untouched edges and concentrate only on the "active" registers
$K, L$. These are at most $d$ registers, each holding numbers between $0, \ldots, q-1$. All together they
are therefore described by at most $d \log q$ qubits.

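Define $k = |K|$ and $\ell = |L|$ (note that $k + \ell \le d$), and assume first that $k = \ell$. Then we
initialize one new qubit to $|0\rangle$ and apply Lemma 3.3 to $M^{K,L}$ (which is now a square matrix), and
transform $|\psi_{i-1}\rangle$ into $|\psi_i\rangle$ with the property that

$$|\psi_i\rangle = \frac{1}{\|M^{K,L}\|} \left[ \big(\mathbf{1}_J \otimes M^{K,L}\big) \frac{|A_{i-1}\rangle}{\|O_{v_1}\| \cdots \|O_{v_{i-1}}\|} \right] \otimes \underbrace{|0\rangle \otimes \cdots \otimes |0\rangle}_{i \text{ qubits}} + \ldots \qquad (19)$$

Using the fact that $\|M^{K,L}\| = \|O_{v_i}\|$, and that $(\mathbf{1}_J \otimes M^{K,L})|A_{i-1}\rangle = O_{v_i}|A_{i-1}\rangle = |A_i\rangle$, we find
that $|\psi_i\rangle$ satisfies (16). Moreover, this transformation is done in $\mathrm{poly}(q^k) \le \mathrm{poly}(q^d)$ time.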

When $k < \ell$ we simply add $\ell - k$ input registers and set them to $|0\rangle$. We redefine $M^{K,L}$ to be
a square matrix that operates identically to the original transformation, provided that the extra
$\ell - k$ registers are all set to $|0\rangle$, and otherwise acts as the zero operator. This guarantees that
$M^{K,L}$ preserves its norm. We can now repeat the $k = \ell$ case. Notice that this process can also be
done in $\mathrm{poly}(q^\ell) \le \mathrm{poly}(q^d)$ time.

Similarly, in the $k > \ell$ case, we add $k - \ell$ output registers and redefine $M^{K,L}$ to always set them
to $|0\rangle$, hence preserving its norm. This finishes the proof of the induction.
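In matrix terms, both padding cases amount to embedding a rectangular matrix into a square one filled with zeros. The sketch below is one way to picture this, under the assumption (made here for illustration) that the extra registers are the leading tensor factors, so that "extra registers equal to $|0\rangle$" corresponds to the first rows or columns; `pad_to_square` is a hypothetical helper.

```python
import numpy as np

def pad_to_square(M):
    """Embed the q^l x q^k matrix of M^{K,L} into a square matrix of zeros."""
    rows, cols = M.shape
    size = max(rows, cols)
    square = np.zeros((size, size), dtype=complex)
    # Extra input registers in |0>: only the first `cols` columns are non-zero.
    # Extra output registers set to |0>: only the first `rows` rows are non-zero.
    square[:rows, :cols] = M
    return square

tall = np.arange(8, dtype=float).reshape(4, 2)    # k < l with q = 2
wide = np.arange(16, dtype=float).reshape(2, 8)   # k > l with q = 2
tall_sq, wide_sq = pad_to_square(tall), pad_to_square(wide)
```

Zero padding leaves the non-zero singular values unchanged, so the operator norm entering the approximation scale is preserved, as the proof requires.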

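Repeating this process all the way up to $i = n$, we generate in $n \cdot \mathrm{poly}(q^d)$ time the state

$$|\psi_n\rangle = \frac{T(G,M)}{\prod_i \|O_{v_i}\|} \underbrace{|0\rangle \otimes \cdots \otimes |0\rangle}_{n \text{ ancilla qubits}} + \ldots \tag{20}$$

which lies in a Hilbert space that is composed entirely of ancilla qubits. We now use the Hadamard test (see the Appendix) to estimate the inner product of $|\psi_n\rangle$ with $|0\rangle \otimes \cdots \otimes |0\rangle$. For any given $\epsilon > 0$, we generate $O(\epsilon^{-2})$ copies of $|\psi_n\rangle$ and, after the appropriate measurement, we obtain a complex number $r'$ such that

$$\Pr\left( \left| \frac{T(G,M)}{\prod_i \|O_{v_i}\|} - r' \right| \ge \epsilon \right) \le \frac{1}{4} . \tag{21}$$

Altogether, this is done in $|V| \cdot \epsilon^{-2} \cdot \mathrm{poly}(q^d)$ time. Multiplying by $\Delta = \prod_i \|O_{v_i}\|$ and outputting $r = \Delta r'$ proves the theorem.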
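The sampling step at the end of the proof can be mimicked classically. The sketch below is only a mock of the Hadamard test, under the assumption that each run returns $\pm 1$ with expectation equal to the real (or imaginary) part of the amplitude $T(G,M)/\Delta$; the function names and the constant `c = 16` are hypothetical choices, not taken from the paper.

```python
import math
import random

def hadamard_test_estimate(amplitude, eps, rng, c=16):
    """Mock Hadamard test: estimate a complex amplitude from +1/-1 outcomes.

    Each run returns +1 with probability (1 + x) / 2, where x is the real
    (or imaginary) part of the amplitude; ceil(c / eps^2) runs are averaged.
    """
    n_samples = math.ceil(c / eps ** 2)

    def sample_mean(x):
        p_plus = (1.0 + x) / 2.0
        hits = sum(1 for _ in range(n_samples) if rng.random() < p_plus)
        return 2.0 * hits / n_samples - 1.0

    return complex(sample_mean(amplitude.real), sample_mean(amplitude.imag))

def additive_estimate(T, delta, eps, rng):
    """Return r = delta * r', where r' estimates T / delta to within about eps."""
    r_prime = hadamard_test_estimate(T / delta, eps, rng)
    return delta * r_prime

rng = random.Random(0)
T, delta, eps = complex(3.0, 1.0), 10.0, 0.05  # hypothetical values with |T| <= delta
r = additive_estimate(T, delta, eps, rng)
print(abs(r - T), "should be of order", delta * eps)
```

The point of the sketch is the scaling: the number of samples depends only on $\epsilon$, while the error in $r$ is measured in units of $\Delta$.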



We note that $\Delta$, the additive scale of approximation of a given tensor network, depends on the choice of bubbling. The issue of finding a good bubbling for a given tensor network is far from trivial. In fact, it is known that the closely related problem of finding the tree width [MS05] of an arbitrary graph (see the footnote on tree width below) is NP-hard [ACP87]. In the present work we treat the bubbling as an external object, given to us together with the tensor network. In Sec. 5, where we construct tensor networks that calculate the partition functions of classical models, we employ a simple bubbling that seems to have certain advantages, but in no way does this mean that it is optimal. Nevertheless, already from these examples it becomes clear how different bubblings can yield very different approximation scales for the same network.



4 The hardness and completeness of approximating a tensor network

As mentioned earlier, special attention needs to be paid to the nature of the additive approximation in Theorem 3.4. Additive approximations are tricky; the approximation scale $\Delta$ might be exponentially larger than $|T(G,M)|$, in which case the output of the algorithm is meaningless. Unfortunately, ruling out this possibility is difficult: if we want to bound the ratio between $\Delta$ and $|T(G,M)|$, we must have some other, external estimate for the latter, which might generally be a difficult task.

^The tree width of a graph is equivalent to the best bubble width of a graph [ALM06]. For a given bubbling, the bubble width is the maximal number of edges crossing the bubble during the swallowing process. The best bubble width is therefore the minimal bubble width over all bubblings of the graph.

We address this issue in two ways. First, we will consider the hardness of the approximation. We will show that our additive approximation is BQP-hard for certain classes of networks, which will prove that the approximation is non-trivial for many instances of the problem.

Second, we will focus on the completeness of the approximation. We will argue that Theorem 3.4 can be viewed as a new way of casting quantum computation rather than as an approximation result.

We begin with a brief review of known complexity results for the evaluation of tensor networks. 

4.1 Classical hardness results 

Tensor networks can be very hard to evaluate; they are the sum of an exponential number of terms. A well-known example is the 3-coloring problem of a graph [Pap94]. We say that a graph $G$ is 3-colorable if it is possible to color each of its vertices in one of 3 colors such that adjacent vertices are always colored differently. Deciding whether a graph is 3-colorable is a famous NP-hard problem, even when restricted to the class of planar graphs of degree 4 [GJS74]. Moreover, counting the number of possible colorings is known to be a #P-complete problem (see, for example, [DGGJ04]). However, given a graph $G = (V, E)$, it is an easy exercise to construct a tensor network that counts its total number of 3-colorings: set $q = 3$, define the tensors at the vertices to give 1 if all their edges are colored identically and 0 otherwise, and finally place a new vertex in the middle of each edge, and define its tensor to give 0 when its two edges are labeled with the same color and 1 otherwise. We leave it to the reader to verify the correctness of this construction.
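For small graphs the construction can be verified by brute force. The following sketch (function names are ours, and the graph is assumed to have no isolated vertices) sums over all labelings of the half-edges of the subdivided graph, applies the vertex and mid-edge tensors described above, and compares the result with a direct count of proper colorings.

```python
from itertools import product

def count_colorings_by_contraction(n_vertices, edges, q=3):
    """Value of the coloring tensor network, by explicit summation over edge labels.

    Each original edge e = (u, v) is split into half-edge 2*e (touching u)
    and half-edge 2*e + 1 (touching v), joined by a mid-edge vertex.
    """
    half_edges_at = [[] for _ in range(n_vertices)]
    for e, (u, v) in enumerate(edges):
        half_edges_at[u].append(2 * e)
        half_edges_at[v].append(2 * e + 1)
    total = 0
    for labels in product(range(q), repeat=2 * len(edges)):
        # Vertex tensors: 1 if all incident half-edges carry the same color, else 0.
        vertex_ok = all(len({labels[h] for h in hs}) <= 1 for hs in half_edges_at)
        # Mid-edge tensors: 0 if the two half-edges carry the same color, else 1.
        edge_ok = all(labels[2 * e] != labels[2 * e + 1] for e in range(len(edges)))
        if vertex_ok and edge_ok:
            total += 1
    return total

def count_colorings_directly(n_vertices, edges, q=3):
    """Number of proper q-colorings, by enumerating vertex colorings."""
    return sum(
        all(c[u] != c[v] for u, v in edges)
        for c in product(range(q), repeat=n_vertices)
    )

triangle = [(0, 1), (1, 2), (0, 2)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_colorings_by_contraction(3, triangle), count_colorings_by_contraction(4, square))
```

The explicit sum has $q^{2|E|}$ terms, which is exactly the exponential cost that makes exact evaluation hard in general.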

So exact evaluation of a tensor network might be #P-hard. But what about approximations, in particular multiplicative approximations? As we allude to in Sec. 6, one can define a tensor network that calculates the Tutte polynomial of a planar graph. This is a two-variable polynomial that can be defined for every graph $G$. It encodes an extremely wide range of interesting combinatorial properties of $G$, making it central in graph theory [Wel93, p. 45]. Its exact evaluation turns out to be #P-hard at all but trivial points [JVW90]. Moreover, recent results show that even a multiplicative approximation to it (FPRAS; see the footnote below) is NP-hard (and sometimes even #P-hard) for a large part of the Tutte plane [GJ07]. Therefore there exist families of tensor networks for which FPRAS approximation is also NP-hard.

Finally, it turns out that additive approximations can also be NP-hard. Indeed, a simple construction in Theorem 4.4 of [BFLW05] shows that an additive approximation of the $q$-coloring problem with a scale $\Delta = (q - 1 - \delta)^{|V|}$, for any $0 < \delta < q - 1$ and $q \ge 3$, is NP-hard, since it can be used to decide whether a graph is $q$-colorable.

4.2 Quantum hardness results 

We now show that there exist classes of tensor networks for which additive approximations are BQP-hard. As we have seen in Sec. 2.3, tensor networks can represent quantum states, as well as linear maps over these states. It is not surprising that quantum circuits can be represented by tensor networks. Indeed, this observation appears in many recent studies that try to draw the border between quantum and classical complexities, characterize the nature of entanglement, and find efficient algorithms to simulate certain classes of quantum systems (see, for example, Refs. [Vid03, Vid04, VC04, MS05, SDV06, ALM06, Vid07, HKH+08]). For the sake of completeness, we show how this encoding can be done, but see [MS05] for a broader view.

^See the paragraph following Definition 2.3 for a definition of an FPRAS approximation.

Figure 6: Constructing a quantum circuit $U$ from another circuit $Q$ such that $\langle 0^{\otimes n}|U|0^{\otimes n}\rangle$ is equal to the probability of measuring the first qubit of $Q|0^{\otimes n}\rangle$ in the state $|0\rangle$.

Consider a quantum circuit $Q = Q_L \cdots Q_1$ that is defined on $n$ qubits. Denote by $p_0$ the probability of measuring a 0 in the last qubit of the original circuit $Q$ applied to $|0^{\otimes n}\rangle$. To perform universal quantum computation, it is enough to distinguish between the cases $p_0 \le 1/3$ and $p_0 \ge 2/3$ for any circuit $Q$. We define a related circuit $U$ on $n + 1$ qubits: $U$ applies $Q$ to $|0^{\otimes n}\rangle = |0\rangle^{\otimes n}$, then copies the last qubit of $Q$ to the additional qubit by a CNOT gate, and then applies the inverse circuit $Q^{-1}$ (see Fig. 6). It is a straightforward algebraic exercise to show that $\langle 0^{\otimes n}|U|0^{\otimes n}\rangle = p_0$ for the original circuit $Q$. We will construct a tensor network whose value is $T(G,M) = \langle 0^{\otimes n}|U|0^{\otimes n}\rangle$ such that a straightforward bubbling of this network, which is associated with the original ordering of the circuit, will yield an approximation scale $\Delta = 1$. This will enable us to distinguish between the two cases $\langle 0^{\otimes n}|U|0^{\otimes n}\rangle = p_0 \ge 2/3$ and $\langle 0^{\otimes n}|U|0^{\otimes n}\rangle = p_0 \le 1/3$, and thus show that the problem of additively approximating tensor networks to the scale described is quantum hard.
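The algebraic exercise can be illustrated in the smallest case. The sketch below assumes $n = 1$, so that $U$ acts on two qubits, and takes $Q$ to be a single-qubit rotation; the helper names and the choice of $Q$ are ours. It builds $U = (Q \otimes \mathbb{1})^\dagger \cdot \mathrm{CNOT} \cdot (Q \otimes \mathbb{1})$ as an explicit matrix and reads off its $|00\rangle \to |00\rangle$ entry.

```python
import math

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def kron(A, B):
    """Kronecker product; the first factor is the more significant index."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def build_U(Q):
    """U = (Q x 1)^dagger . CNOT . (Q x 1), with the circuit qubit as CNOT control."""
    identity = [[1, 0], [0, 1]]
    cnot = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
    q_ext = kron(Q, identity)
    return matmul(dagger(q_ext), matmul(cnot, q_ext))

def ry(theta):
    """Real single-qubit rotation; applied to |0> it gives p_0 = cos^2(theta / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

theta = 1.0
p0 = math.cos(theta / 2) ** 2
amplitude = build_U(ry(theta))[0][0]  # <00| U |00>
print(amplitude, p0)
```

The CNOT copies the measured qubit in the computational basis, so the overlap with $|00\rangle$ after undoing $Q$ picks out exactly the weight of the outcome 0.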

Let us now define the tensor network. The dimension of every tensor is $q = 2$, corresponding to the two possible values of a qubit. The network consists of 3 types of tensors:

• Every $d$-local gate $Q$ is translated into a rank-$2d$ tensor with $d$ input edges and $d$ output edges:

$$(M_Q)_{i_1,\ldots,i_d,\, j_1,\ldots,j_d} = \big(\langle j_1| \otimes \cdots \otimes \langle j_d|\big)\, Q \,\big(|i_1\rangle \otimes \cdots \otimes |i_d\rangle\big) \tag{22}$$

• Every input qubit $|0\rangle$ is translated into a rank-1 tensor

$$M_k = \delta_{k,0} \tag{23}$$

• Every output qubit $\langle 0|$ is translated into a rank-1 tensor

$$M_k = \delta_{k,0} \tag{24}$$

Contracting these tensors according to the topological structure of the circuit, we obtain a tensor network $T(G,M)$, and it is a straightforward exercise to check that $T(G,M) = \langle 0^{\otimes n}|U|0^{\otimes n}\rangle$. Finally, when bubbling the network according to the natural evolution of the circuit, the swallowing operators of the tensors associated with the gates become the gates themselves, hence their norm is 1. Similarly, the norms of the swallowing operators that are associated with Eqs. (23) and (24) are also easily seen to be 1. All in all we have an approximation scale $\Delta = 1$.
We therefore reach the following corollary:

Corollary 4.1 There exist families of tensor networks for which the additive approximation in Theorem 3.4 is BQP-hard. In particular, this holds for all families that correspond to families of universal quantum circuits by the construction above.

It is therefore evident that, for certain collections of tensor networks, the approximation in Theorem 3.4 is non-trivial. This does not necessarily hold for every member in such "universal" families of networks, only for the family as a whole. The universal families cited in the above corollary originate from quantum circuits, and the quantum universality of their approximation relies heavily on the unitarity of their operators. There are, however, other universal families of tensor networks that are not so tightly related to quantum computation. In Sec. 6 we refer to one of these families, a family of tensor networks that approximate the multivariate Tutte polynomial. Unlike the example above, their underlying operatorial structure is non-unitary. The proof that approximating these tensor networks is quantum-hard can be found in [AAEL07].

The fact that quantum circuits can be viewed as tensor networks is the main theme in the paper of Markov & Shi [MS05] and later in [ALM06]. These papers study the question of when a tensor network can be evaluated classically. [MS05] uses the notion of tree width of a graph, which is equivalent to the notion of bubble width that is used in [ALM06] (see footnote on page 12). It is shown that a sufficient condition for an efficient evaluation of a tensor network is that the tree width (or, equivalently, the bubble width) of the graph is of logarithmic size. To minimize the running time of the simulation, one should choose a bubbling with a minimal bubble width. This is done regardless of the original ordering of the circuits. This leads us to reconsider Theorem 3.4 as essentially a new view of quantum computation, which we discuss in the next section.

4.3 Completeness: tensor networks as a different point of view on quantum 
computation - the role of interference 

Loosely speaking, a useful quantum algorithm must manipulate the interferences of the wave function in a smart way; an instance of a YES/NO problem must be mapped to a circuit that produces a constructive/destructive interference. Formally, a quantum algorithm solves a decision problem if, for every instance $x$, we can (efficiently) generate a quantum circuit $U_x$ such that $\langle 0^{\otimes n}|U_x|0^{\otimes n}\rangle \le 1/3$ for a YES instance and $\langle 0^{\otimes n}|U_x|0^{\otimes n}\rangle \ge 2/3$ for a NO instance.

In view of Theorem 3.4 we can rephrase this demand in terms of tensor networks. The constructive interference demand translates into $|T(G,M)|$ being of the same order as the approximation scale $\Delta$ (or only polynomially smaller). In other words, we care less about the actual value $T(G,M)$ of the tensor network and more about the ratio $|T_x(G,M)|/\Delta_x$. The following corollary can be seen as an alternative formulation for an efficient quantum computation that stresses this point:

Corollary 4.2 (Efficient tensor-network based quantum computation) A decision problem is in BQP if and only if there is an efficient classical transformation that maps any instance $x$ of the problem into a tensor network $T_x(G_x, M_x)$ over a fixed $q$ and a graph $G_x$ of maximal degree $d$, and a corresponding bubbling $B_x$ such that:

$$x \text{ is a 'YES' instance} \;\Rightarrow\; \frac{|T_x(G,M)|}{\Delta_x} \ge \frac{2}{\mathrm{poly}(|x|)} , \tag{25}$$

$$x \text{ is a 'NO' instance} \;\Rightarrow\; \frac{|T_x(G,M)|}{\Delta_x} \le \frac{1}{\mathrm{poly}(|x|)} , \tag{26}$$

where $\Delta_x$ is as in Theorem 3.4.



This formulation generalizes our notion of quantum computation in two ways:

• Time evolution of the circuit is no longer fixed; we may pick any bubbling that provides the best approximation scale.

• Unitarity is gone. The operators no longer need to be unitary or even have the same domain and co-domain. We are free to construct circuits with non-unitary gates, as well as graphs with arbitrary topology, as long as the vertex degree is bounded. The approximation scale, however, might no longer be 1.

We are hopeful that this new way of looking at quantum computation will lead to new algorithms 
beyond those discussed in the next two sections. 

5 Classical statistical mechanics models 

In this section we present a set of models from statistical physics that can be defined on arbitrary 
graphs. We will see how the tensor-network formalism of the previous section can be used to
construct efficient quantum algorithms that approximate the partition function of these models. 
After introducing these models, we will construct the corresponding tensor network and apply our 
results to produce a general quantum algorithm (Corollary []). Though these algorithms are new, 
the connection between classical statistical mechanics models and quantum computation is not, 
and has been discussed previously in [VdNDB07b, VdNDB07a, VdN07, AAEL07, UL08, Ger08].
We hope that this section will enrich and clarify the nature of this connection. 

A proper introduction of these statistical models is far beyond the scope of this paper; here 
we will only provide the details necessary for understanding the tensor-network constructions. An 
interested reader can find an introduction to these models in any standard textbook on the subject,
for example, [Cal85] . 

5.1 A brief introduction to classical statistical models on graphs 

In statistical physics, one is often interested in the macroscopic behavior of a system that is made 
from a very large number of microscopic systems that interact with each other. In most cases, the 
everyday systems that we wish to describe are far too complex to be treated analytically. A common 
practice is therefore to study toy models, which are simple enough to be analyzed analytically, yet 
are rich enough to teach us something about the more realistic models. 

A very broad class of such toy models, which we call $q$-state models, can be defined on finite graphs. We consider a graph $G = (V,E)$ and view its vertices as the microscopic subsystems that we wish to model; for example, the atoms of a crystal. We assume that each microscopic system can be found in one of $q$ possible states that are numbered by $0, 1, \ldots, q-1$. We will often refer to these states as "colors", and to the labeling of all vertices as a "coloring" of $G$. Such a coloring is denoted by a vector $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_{|V|})$ that assigns a color $\sigma_i$ to every vertex $i$. A coloring completely specifies the microscopic state of the system. We remark that we do not require adjacent vertices of $G$ to be labeled by different colors.

Next, we use the edges of the graph to denote interactions between the vertices. For every edge $e = (i,j) \in E$, we define a (real) function $h_{ij}(\sigma_i,\sigma_j)$ (also denoted by $h_e(\sigma_i,\sigma_j)$) that specifies the interaction energy between the vertices $i$ and $j$. The overall energy of the system for a particular coloring is therefore

$$H(\sigma) = \sum_{(i,j) \in E} h_{ij}(\sigma_i,\sigma_j) . \tag{27}$$

To understand the macroscopic behavior of the system, we would like to know the probability of the system to be in a given microscopic state. Usually, one assumes that the system is attached to another, much bigger, system, which we call a "heat bath". The attachment of the two systems means that energy can freely flow from one system to the other. In such a case, under fairly simple assumptions that will not be discussed here (see, for example, [Cal85]), we find that the probability of the system to be in a microscopic state $\sigma$ is given by the Boltzmann-Gibbs distribution

$$P(\sigma) = \frac{1}{Z(\beta)} e^{-\beta H(\sigma)} , \qquad \beta = \frac{1}{k_B T} . \tag{28}$$

Here, $\beta$ is an external parameter, which is inversely proportional to $T$, the temperature of the system, and $k_B$ is the Boltzmann constant. $Z(\beta)$ is the normalization factor, which is called the partition function of the system, and is given by

$$Z(\beta) \stackrel{\text{def}}{=} \sum_{\sigma} e^{-\beta H(\sigma)} . \tag{29}$$

It turns out that many interesting macroscopic properties of the system can be deduced solely from 
the partition function. These include the average energy of the system, its entropy, specific heat, 
and more elaborate properties such as phase-transitions [Cal85]. The calculation of the partition
function is therefore an important task in the theory of statistical physics. 

Before explaining how this can be done in the framework of a tensor network, we list some of the 
well-known models of statistical mechanics that fall into this category, which were also discussed 
in Refs. [VdNDB07b, VdNDB07a]:

1. Ising Model

In the Ising model every vertex can be colored in one of two colors, or alternatively, every vertex denotes a classical spin that can point up or down. In order to keep the notation simple, let us assume that $\sigma_i$ takes the values $\{1, -1\}$ (instead of $\{0, 1\}$). Then the interaction energy of every edge is simply:

$$h(\sigma_i,\sigma_j) = -J \sigma_i \sigma_j . \tag{30}$$

$J$ is called the coupling constant. If $J > 0$, the model is called ferromagnetic. In this case, the neighboring spins will tend to point in the same direction. When $J < 0$, the model is called antiferromagnetic, and the spins will tend to be antialigned.

2. Clock Model

The $q$-state Clock model is a generalization of the Ising model to $q$ colors. Here, at each vertex the spin can point in one of $q$ equally spaced directions in the plane, and is therefore specified by an angle

$$\theta_n = \frac{2\pi n}{q} , \qquad n = 0, 1, \ldots, q-1 . \tag{31}$$

Then the interaction energy is

$$h(\sigma_i,\sigma_j) = -J \cos(\theta_{\sigma_i} - \theta_{\sigma_j}) . \tag{32}$$

3. Potts Model

The $q$-state Potts model is another generalization of the Ising model, simpler than the clock model. As in the clock model, every vertex can be in one of $q$ colors, but instead of using the cosine function for the interaction energy, we use the Kronecker delta function:

$$h(\sigma_i,\sigma_j) = -J \delta_{\sigma_i,\sigma_j} . \tag{33}$$

The Ising, Clock, and Potts models are all examples of a class of $q$-state models in which the coupling energies depend only on the difference (modulo $q$) of the colors, i.e. $h(\sigma_i,\sigma_j) = h((\sigma_i - \sigma_j) \bmod q)$; we shall call the class of models with this property difference models. In the above three models, the coupling constant $J$ was the same for every edge, and we say that these systems have homogeneous coupling. In general, different couplings can be used.
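The three models and the difference property can be explored numerically on tiny graphs. The following is a minimal sketch with homogeneous coupling; the function names are ours, and the Ising energy is written for colors $\{0, 1\}$ mapped to spins $\{+1, -1\}$, which is an assumption about the encoding rather than something fixed by the text.

```python
import math
from itertools import product

def ising_energy(J):
    # Colors 0 and 1 are mapped to spins +1 and -1 (an encoding assumption).
    return lambda a, b: -J * (1 - 2 * a) * (1 - 2 * b)

def clock_energy(J, q):
    return lambda a, b: -J * math.cos(2 * math.pi * (a - b) / q)

def potts_energy(J):
    return lambda a, b: -J * (1.0 if a == b else 0.0)

def partition_function(n_vertices, edges, h, q, beta):
    """Z(beta) as the sum over all q^|V| colorings of exp(-beta * H(sigma))."""
    Z = 0.0
    for sigma in product(range(q), repeat=n_vertices):
        energy = sum(h(sigma[i], sigma[j]) for i, j in edges)
        Z += math.exp(-beta * energy)
    return Z

def is_difference_model(h, q):
    """Check h(a, b) == h((a - b) mod q, 0) for all pairs of colors."""
    return all(
        math.isclose(h(a, b), h((a - b) % q, 0), abs_tol=1e-12)
        for a in range(q) for b in range(q)
    )

beta, J = 0.7, 1.3
single_edge = [(0, 1)]
print(partition_function(2, single_edge, ising_energy(J), 2, beta))
print(partition_function(2, single_edge, potts_energy(J), 3, beta))
```

On a single edge the sums can be done by hand: the Ising model gives $2e^{\beta J} + 2e^{-\beta J}$ and the 3-state Potts model gives $3e^{\beta J} + 6$.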

5.2 Constructing a tensor-network in the general case 

In this section we define a tensor network that evaluates the partition function of the general $q$-state model. Applying Theorem 3.4 to this tensor network will give a quantum algorithm for approximating the partition function of the general $q$-state model (Corollary 5.4). We make no assumptions on the functional form of the coupling energy functions $h_e(\sigma_i,\sigma_j)$. In the next section we will assume that $h(\sigma_i,\sigma_j) = h((\sigma_i - \sigma_j) \bmod q)$, an assumption that is satisfied by the Ising model, the Potts model, and the Clock model. For difference models, we can write a different tensor network, which in turn leads to a different quantum algorithm for approximating the partition function (Corollary 5.10). For difference models, the approximation scales in Corollaries 5.4 and 5.10 are incomparable, i.e., there exist choices of parameters for which each is better than the other.

A tensor network seems to be a natural tool to calculate the partition function of a $q$-state model: in both cases, we have a summation over all possible labelings/colorings. However, in the partition function, the summation is over colorings of the vertices, whereas in the tensor network, the summation is over labelings of the edges. To resolve this mismatch, we introduce a new graph $\tilde{G} = (\tilde{V}, \tilde{E})$ by putting a new vertex in the middle of every edge of $G$ (see Fig. 7). We call these vertices energy vertices. They are denoted by $V_E$, and the vertices of $G$ are denoted by $V_G$. Then $\tilde{V} = V_E \cup V_G$, $|\tilde{V}| = |V| + |E|$ and $|\tilde{E}| = 2|E|$. On this graph, we define the following network:

Definition 5.1 (Tensor network for the general $q$-state statistical model) For the $q$-state statistical model that is defined on $G = (V, E)$ with the interaction energy $h_{ij}(\cdot,\cdot)$ for every $(i,j) \in E$ and an inverse temperature $\beta$, we define the following tensor network $T(\tilde{G}, M)$:

• Graph: The graph $\tilde{G} = (\tilde{V}, \tilde{E})$ that is defined above.

• Labeling: Every edge $e \in \tilde{E}$ can be colored in one of $q$ possible colors: $0, 1, \ldots, q-1$.

• Tensors: We have two types of tensors:

1. For $v \in V_G$ (original vertex of $G$), the tensor $M_v$ is defined to be zero unless all its edges are colored identically, in which case it is 1.

2. For $v \in V_E$ (energy vertex), which is in the middle of an original edge $(i,j) \in E$ of $G$, the tensor $M_v$ is defined by

$$(M_v)_{\sigma_i,\sigma_j} = e^{-\beta h_{ij}(\sigma_i,\sigma_j)} . \tag{34}$$

Here $\sigma_i, \sigma_j$ are the colorings of the two edges that connect to $v$.

The following theorem shows that this tensor network evaluates $Z_G$.

Theorem 5.2 The tensor network in Def. 5.1 evaluates $Z_G(\beta)$.

Figure 7: Creating $\tilde{G}$ from $G$ by placing new "energy" vertices (unfilled) in the middle of every edge of $G$.

Figure 8: The $n \to m$ notation for describing the swallowing of a vertex.



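Proof: By Eq. (1), the value of the tensor network is

$$T(\tilde{G},M) = \sum_{l} \Big( \prod_{v \in V_G} M_v(l) \Big) \cdot \Big( \prod_{v \in V_E} M_v(l) \Big) . \tag{35}$$

For every labeling $l$, the term $\prod_{v \in V_G} M_v(l)$ vanishes unless the edges connected to each original vertex of $G$ are labeled identically. This defines a unique coloring of each original vertex of $G$, which we denote by $\sigma$, and therefore

$$T(\tilde{G},M) = \sum_{\sigma} \Big( \prod_{v \in V_E} M_v(\sigma) \Big) . \tag{36}$$

Then by Eq. (34) we obtain

$$T(\tilde{G},M) = \sum_{\sigma} \prod_{(i,j) \in E} e^{-\beta h_{ij}(\sigma_i,\sigma_j)} = \sum_{\sigma} e^{-\beta H(\sigma)} = Z_G(\beta) . \tag{37}$$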
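The statement of Theorem 5.2 can be checked by brute force on a small graph. The sketch below uses function names of our own choosing and assumes a graph without isolated vertices and a single energy function $h$ shared by all edges. It sums over all labelings of the $2|E|$ edges of the subdivided graph, applies the identity tensors and the energy tensors of Def. 5.1, and compares the result with the partition function computed directly.

```python
import math
from itertools import product

def partition_function(n_vertices, edges, h, q, beta):
    """Direct sum over vertex colorings of exp(-beta * H(sigma))."""
    return sum(
        math.exp(-beta * sum(h(s[i], s[j]) for i, j in edges))
        for s in product(range(q), repeat=n_vertices)
    )

def tensor_network_value(n_vertices, edges, h, q, beta):
    """Sum over labelings of the subdivided graph, as in Eq. (35)."""
    half_edges_at = [[] for _ in range(n_vertices)]
    for e, (i, j) in enumerate(edges):
        half_edges_at[i].append(2 * e)      # half-edge between i and the energy vertex
        half_edges_at[j].append(2 * e + 1)  # half-edge between the energy vertex and j
    total = 0.0
    for labels in product(range(q), repeat=2 * len(edges)):
        # Identity tensors on original vertices: all incident edges must agree.
        if any(len({labels[x] for x in hs}) > 1 for hs in half_edges_at):
            continue
        weight = 1.0
        for e in range(len(edges)):
            # Energy tensor of Eq. (34) on the vertex in the middle of edge e.
            weight *= math.exp(-beta * h(labels[2 * e], labels[2 * e + 1]))
        total += weight
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
potts = lambda a, b: -1.0 if a == b else 0.0  # Potts energy with J = 1
print(tensor_network_value(3, triangle, potts, 3, 0.5),
      partition_function(3, triangle, potts, 3, 0.5))
```

The edge-labeling sum has $q^{2|E|}$ terms, but the identity tensors leave only $q^{|V|}$ of them nonzero, one per vertex coloring.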



We would now like to analyze the approximation scale of the above tensor network that results from a simple bubbling. To do that, let us first agree on a notation to describe the swallowing of a vertex: we say a vertex with $n + m$ adjacent edges is swallowed in an $n \to m$ fashion if, before it is swallowed, it has $n$ adjacent edges that end inside the bubble and $m$ edges that end outside the bubble. Figure 8 illustrates this notation. Consider now the following simple bubbling: we embed the parent graph $G$ in 3D space such that every vertex is put at a different height and every edge is a straight line. Thus all edges are non-horizontal. We then add the energy vertices in the middle of every edge of $G$. Our bubbling is defined by swallowing $\tilde{G}$ using a horizontal plane (bubble) that moves from bottom to top. The embedding ensures that every energy vertex is swallowed in a $1 \to 1$ fashion. The original vertices of $G$ can be swallowed in many different ways. We set $b$ to be the number of such vertices that are swallowed in a $0 \to n$ or an $n \to 0$ fashion.
Then the norms of this bubbling are as follows:

• Original vertices ($V_G$)

- For the $b$ vertices that are swallowed in a $0 \to n$ or $n \to 0$ fashion, the norm is $q^{1/2}$. Indeed, in a $0 \to n$ swallowing, the operator takes the normalized state $|\Omega\rangle$ and creates the state $|0\rangle^{\otimes n} + |1\rangle^{\otimes n} + \ldots + |q-1\rangle^{\otimes n}$, whose norm is $q^{1/2}$. Similarly, an $n \to 0$ swallowing accepts any state $|\psi\rangle$ and outputs the state $c|\Omega\rangle$, where $c$ is the inner product of $|\psi\rangle$ with $|0\rangle^{\otimes n} + |1\rangle^{\otimes n} + \ldots + |q-1\rangle^{\otimes n}$. Obviously, when $|\psi\rangle$ is normalized, $|c| \le q^{1/2}$.

- For the vertices that are swallowed in an $n \to m$ fashion, with $n > 0$, $m > 0$, the norm is 1. To see this, expand a normalized state in the domain $H^{\otimes n}$ of the operator in terms of the standard basis. The only terms in the expansion that will not be annihilated are those of the form $|i\rangle^{\otimes n}$, $i = 0, \ldots, q-1$; they will be sent to $|i\rangle^{\otimes m}$ in the range of the operator. The operator is therefore the identity operator from a $q$-dimensional subspace in the domain to a $q$-dimensional subspace in the range, hence its norm is 1.

• Energy vertices ($V_E$). These are always swallowed in a $1 \to 1$ fashion. In that case, the tensors act as a $q \times q$ matrix $\big(e^{-\beta h_{ij}(\sigma_i,\sigma_j)}\big)_{\sigma_i,\sigma_j}$ that maps the color space of one edge to the color space of the other edge; the norm of the tensor is the operator norm of the matrix.

Combining all of this together, we have an approximation scale of:

$$\Delta \le q^{b/2} \prod_{e \in E} \|e^{-\beta h_e}\| . \tag{38}$$

We now give an upper bound on $b$. Clearly $b \le |V|$, but in many embeddings of $G$ in $\mathbb{R}^3$, $b$ is significantly smaller than $|V|$. Consider the 2-dimensional lattice; rotate the lattice so that two opposite corners are the highest and lowest vertices of the graph. Now the bubbling described above only swallows two vertices in $V_G$ (the first and the last) in a $0 \to n$ or $n \to 0$ fashion, and thus $b = 2$. The identical argument holds for any finite lattice of any dimension. More generally, we can describe $b$ in terms of directed acyclic graphs on $G$.

Definition 5.3 Given a graph $G = (V,E)$, a directed acyclic graph on $G$ is an assignment of a direction to each edge in $E$ so that there are no directed cycles. An extreme vertex of a directed acyclic graph on $G$ is a vertex for which the adjacent edges are either all directed towards the vertex, or all directed away from the vertex.

It should be clear that any bubbling induces a directed acyclic graph on $G$, directing any edge away from the vertex that is swallowed first. With this point of view, $b$ is then the number of extreme vertices of the directed acyclic graph.
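Counting extreme vertices for a height-ordered bubbling is mechanical. In the sketch below (the function name and the two examples are ours), every edge is directed from its lower endpoint to its higher one, matching a plane that sweeps from bottom to top; the heights only need to differ across each edge.

```python
def count_extreme_vertices(heights, edges):
    """Number of vertices whose edges are all incoming or all outgoing.

    heights: dict mapping each vertex to its height.
    edges:   list of vertex pairs; each edge is directed from low to high.
    Isolated vertices are not counted.
    """
    n_in = {v: 0 for v in heights}
    n_out = {v: 0 for v in heights}
    for u, v in edges:
        assert heights[u] != heights[v], "edges must be non-horizontal"
        low, high = (u, v) if heights[u] < heights[v] else (v, u)
        n_out[low] += 1
        n_in[high] += 1
    # Extreme: exactly one of the two counts is zero.
    return sum(1 for v in heights if (n_in[v] == 0) != (n_out[v] == 0))

# A 3 x 3 lattice tilted so that opposite corners are lowest and highest.
size = 3
grid_heights = {(x, y): x + y for x in range(size) for y in range(size)}
grid_edges = [((x, y), (x + 1, y)) for x in range(size - 1) for y in range(size)]
grid_edges += [((x, y), (x, y + 1)) for x in range(size) for y in range(size - 1)]
print(count_extreme_vertices(grid_heights, grid_edges))

# A path whose middle vertex is placed highest: every vertex is extreme.
path_heights = {0: 0.0, 1: 2.0, 2: 1.0}
print(count_extreme_vertices(path_heights, [(0, 1), (1, 2)]))
```

The path example shows that a poor choice of heights can raise $b$, and with it the factor $q^{b/2}$ in the approximation scale.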

Putting this all together we have the following: 

Corollary 5.4 (An efficient quantum algorithm for the general $q$-state model) Consider a $q$-state model defined over a finite graph $G = (V,E)$ with coupling energies $h_e(\sigma_i,\sigma_j)$, and a directed acyclic graph on $G$ with $b$ extreme vertices. Then there exists an efficient quantum algorithm for an additive approximation of the partition function of this model. The approximation scale of the algorithm is

$$\Delta \le q^{b/2} \prod_{e \in E} \|e^{-\beta h_e}\| . \tag{39}$$

Here, $\|e^{-\beta h_e}\|$ denotes the operator norm of the $q \times q$ matrix $\big(e^{-\beta h_e(\sigma_i,\sigma_j)}\big)_{\sigma_i,\sigma_j}$.

Figure 9: Reducing the degree of a vertex by replacing it with a low-degree tree. This reduction is possible for simple vertices such as the identity vertex and the cycle vertices in Sec. 5.3.

When $G$ is a finite lattice (of any dimension), there is always an acyclic graph with $b = 2$, hence the resultant approximation scale is:

$$\Delta \le q \prod_{e \in E} \|e^{-\beta h_e}\| . \tag{40}$$
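To get a feeling for the scale in Corollary 5.4, one can evaluate the edge norms for the Potts model. The sketch below is an illustration of ours, not a result quoted from the text: for real $\beta J$ the matrix $e^{-\beta h}$ has $e^{\beta J}$ on the diagonal and 1 elsewhere, so its eigenvalues are $e^{\beta J} + q - 1$ (once) and $e^{\beta J} - 1$, and the operator norm is $e^{\beta J} + q - 1$. The code confirms this by power iteration and assembles the bound; all function names are hypothetical.

```python
import math

def op_norm(M, iters=500):
    """Operator norm of a real matrix, by power iteration on M^T M."""
    rows, cols = len(M), len(M[0])
    v = [1.0 + 0.1 * j for j in range(cols)]  # generic start vector
    sigma_sq = 0.0
    for _ in range(iters):
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        w = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [sum(M[i][j] * w[i] for i in range(rows)) for j in range(cols)]
        sigma_sq = math.sqrt(sum(x * x for x in v))
        if sigma_sq == 0.0:
            return 0.0
    return math.sqrt(sigma_sq)

def potts_edge_matrix(q, beta, J):
    """The q x q matrix exp(-beta * h) for h = -J * delta(a, b)."""
    return [[math.exp(beta * J) if a == b else 1.0 for b in range(q)] for a in range(q)]

def scale_bound(q, b, edge_norms):
    """Right-hand side of Eq. (39): q^(b/2) times the product of the edge norms."""
    bound = q ** (b / 2)
    for norm in edge_norms:
        bound *= norm
    return bound

q, beta, J = 3, 0.5, 1.0
numeric = op_norm(potts_edge_matrix(q, beta, J))
closed_form = math.exp(beta * J) + q - 1
print(numeric, closed_form)
# A 2 x 2 lattice has 4 edges; with b = 2, q = 2 and beta = 0 every edge norm is 2.
print(scale_bound(2, 2, [2.0] * 4))
```

At $\beta = 0$ on the $2 \times 2$ lattice with $q = 2$ the bound is 32, while the partition function itself is $q^{|V|} = 16$, so in this instance the scale is within a constant factor of the quantity being approximated.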

We note that: 

1. In the corollary we did not restrict the underlying graph $G$ to have a bounded degree. The reason is that the identity tensors are reducible in the following sense: every vertex of degree $k$ that represents an identity tensor can be locally replaced by a tree graph with bounded degree (say, $d = 3$) as described in Fig. 9. The new vertices of the tree are defined to be identity tensors as well, and so by the connectivity of the tree, the only non-vanishing coloring of the external edges is the one where they are all colored identically. Moreover, in such a case, their overall weight is 1 as required.

This reduction does not affect the approximation scale, for if we swallow the identity tensor in an $m \to n$ manner with $m > 0$, $n > 0$, then we can also swallow the tree graph such that no vertex is swallowed in a $0 \to \ell$ or $\ell \to 0$ manner. Therefore the norm of all the identity tensors is 1, hence their product is also 1, in agreement with the original scale. If, on the other hand, we swallow it in a $0 \to k$ or $k \to 0$ manner, then we can swallow the tree graph such that one vertex is swallowed in a $0 \to \ell$ (or $\ell \to 0$) manner while the rest are not, thereby yielding the overall approximation scale $q^{1/2}$ as required.

2. The above algorithm works for any temperature $\beta$ and any coupling energies $h_e(\cdot,\cdot)$, not necessarily physical ones (e.g., they can be complex).
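The reducibility of identity tensors claimed in the first note is easy to confirm in the smallest case. The following sketch (helper names are ours) replaces a degree-4 identity tensor by two degree-3 identity tensors joined by one internal edge, and checks that summing over the internal edge reproduces the original tensor for every coloring of the four external edges.

```python
from itertools import product

def identity_tensor(indices):
    """1 if all indices carry the same color, 0 otherwise."""
    return 1 if len(set(indices)) <= 1 else 0

def tree_contraction(a, b, c, d, q):
    """Two degree-3 identity tensors joined by an internal edge with label x."""
    return sum(
        identity_tensor((a, b, x)) * identity_tensor((x, c, d))
        for x in range(q)
    )

q = 3
mismatches = [
    ext for ext in product(range(q), repeat=4)
    if tree_contraction(*ext, q) != identity_tensor(ext)
]
print(len(mismatches))
```

The same argument applies to any tree: the internal labels are forced to equal the common external color, so exactly one term of the internal sum survives.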

Shortly after the first release of this paper, Van den Nest et al. [VdNDR08] published an interesting paper with closely related results. They gave quantum algorithms for several classical statistical-mechanics models and also provided some complementary hardness results. The models considered were on two-dimensional grids with restrictions on the form of the coupling energies. Specifically, they considered two types of statistical models: vertex models and edge models [Bax08]. Edge models are essentially $q$-state models, in which the particles sit on the vertices of a graph and the edges model the interactions between them. These include the Ising model, which was the actual model of this type that the authors studied. In the vertex models, the classical particles sit on the edges of the grid, and the vertices model many-particle interactions.

These algorithms fit nicely in the tensor-network formalism that we presented here. Indeed, a close inspection reveals that they can be considered as a bubbling of a two-dimensional network on a grid, and therefore, as a special case of Theorem 3.4. Moreover, in the Ising model case, the functional form of the tensors they use is identical to the one used here. In addition, in all cases the bubbling is along one dimension of the grid (say, from left to right), and the couplings in the model are restricted so that the swallowing operators are unitary (thereby directly implementable on a quantum computer). Finally, the authors assumed that the configuration of the particles on the left and right boundaries of the grid is fixed. Consequently, the approximation scale in both models is of order one.

To prove the BQP-hardness of these approximations, the authors showed that in both models, 
one can choose the couplings such that the resulting unitary operators form a universal set of gates 
for quantum computation. We can use their hardness result with respect to the Ising model to 
prove the following Corollary: 

Corollary 5.5 ([VdNDR08]) There exist classes of general $q$-state models for which the approximation achieved in Corollary 5.4 is BQP-hard.

Proof: As mentioned above, in the Ising case, [VdNDR08] uses the same tensor network as in Def. 5.1 and places it on a two-dimensional grid that is swallowed from left to right. The only difference is that they allow the use of fixed boundary conditions. This means that the identity tensors on the right and left boundaries are replaced by tensors that allow only one configuration out of the possible $q$. Consequently, their network is incompatible with our general $q$-state model construction.

Nevertheless, we can turn their fixed boundary Ising model into a general $q$-state model of the type considered here by introducing two additional vertices, one to the left and one to the right of the lattice. Connecting each of these to the boundary, we can impose the fixed boundary conditions by an appropriate choice of couplings of the new edges. For example, assume that the additional vertex is indexed by $i$ and that $j$ sits on the boundary. If we want to fix the color of the boundary vertex $j$ to 0, we define the interaction between these particles by a matrix $e^{-\beta h_{ij}(\sigma_i,\sigma_j)}$ that is 1 for $\sigma_i = \sigma_j = 0$ and 0 otherwise. This tensor network is therefore no longer a "pure" Ising model, but it is still a general $q$-state model. In addition, its value is identical to the value of the tensor network in [VdNDR08].

Swallowing this network from left to right, we notice that $b = 2$, and that the norm of the newly introduced edges is exactly 1. Since the rest of the energy vertices in the network produce unitary operators with a unit norm, Corollary 5.4 achieves a constant approximation scale $\Delta$. By [VdNDR08], this network evaluates the result of a universal quantum computation, and therefore the approximation is BQP-hard. ∎

5.3 Tensor networks for the Difference Models. 

In this section we restrict our attention to difference models. As we noted earlier, the Ising,
Clock, and Potts models are all examples of difference models. For these models, we can define
an alternative tensor network that, together with a suitable bubbling, evaluates the partition
function with a better approximation scale than the general case for certain choices of couplings.

Figure 10: The cycles in a graph and their relation with the constraints over the $\{\delta_{ij}\}$ variables.
In this example, there are 3 cycles, $1 \to 2 \to 3 \to 1$, $1 \to 4 \to 3 \to 1$, and $1 \to 2 \to 3 \to 4 \to 1$, but only
2 of them are independent. They are translated into 3 consistency equations between the $\{\delta_{ij}\}$
variables (with only two being independent) as described in the text.

The idea is to use the redundancy of the coupling energy to define the tensor network on a smaller 
Hilbert space, which leads to smaller norms, hence, a better approximation scale. 

We begin by turning G into a directed graph by placing an arrow on every edge of G. The
direction of the arrow is unimportant. Then for every edge e = (i,j) with an arrow going from j to
i, we define a variable $\delta_{ij} := (\sigma_i - \sigma_j) \bmod q$. $\delta_{ij}$ takes on the values 0, 1, ..., q - 1. By assumption,
the coupling energy $h_{ij}(\sigma_i,\sigma_j)$ depends only on that variable.

We would like to write the partition function as a sum over all possible labelings of the delta
variables. However, these variables are not independent: every cycle in the graph (a cyclic sequence
of adjacent vertices) yields a consistency constraint on the variables that are associated with its
edges; their appropriate sum or difference must yield 0. For example, in Fig. 10 there are 3 cycles:

• $1 \to 2 \to 3 \to 1$: corresponds to $\delta_{2,1} + \delta_{3,2} + \delta_{1,3} = 0$.

• $1 \to 4 \to 3 \to 1$: corresponds to $\delta_{4,1} - \delta_{4,3} + \delta_{1,3} = 0$.

• $1 \to 2 \to 3 \to 4 \to 1$: corresponds to $\delta_{2,1} + \delta_{3,2} + \delta_{4,3} - \delta_{4,1} = 0$.

Not all these equations are independent; the third equation is the difference of the first two, just as
the first two cycles can be joined to obtain the third. We should therefore limit our attention to a
set of independent cycles that correspond to a set of independent equations. One possible way to
obtain such a set is the following: start with a spanning tree of the graph. As G is connected, the
spanning tree must have |V| - 1 edges. Now add the remaining |E| - |V| + 1 edges. Every such
edge e = (i,j) creates a new cycle that only uses e and part of the original spanning tree (since the
i and j vertices were already connected by the spanning tree). Moreover, this cycle is independent
of the other cycles because none of them contain the edge (i,j). This way we construct a set of
|E| - |V| + 1 independent cycles, which we denote by $C = \{C_j\}$. The number of independent $\delta$
variables is therefore |V| - 1, one for each edge of the original spanning tree. Finally, it is easy to
verify that this procedure can be done efficiently.
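As an illustration of the spanning-tree procedure just described, the following is a minimal Python sketch, not part of the original construction. It assumes a simple connected graph whose vertices are labeled 0, ..., |V| - 1; the function name `spanning_tree_and_cycles` and the breadth-first choice of spanning tree are our own illustrative assumptions (any spanning tree would do).

```python
from collections import deque


def spanning_tree_and_cycles(num_vertices, edges):
    """Split the edges of a simple connected graph into tree and cycle edges.

    Returns (tree_edges, cycle_edges, cycles), where cycles maps every
    non-tree edge (i, j) to the set of tree edges (as frozensets) on the
    tree path joining i and j.
    """
    adjacency = {v: [] for v in range(num_vertices)}
    for i, j in edges:
        adjacency[i].append(j)
        adjacency[j].append(i)

    # Breadth-first search from vertex 0 defines the spanning tree.
    parent = {0: None}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)

    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    tree_edges = [e for e in edges if frozenset(e) in tree]
    cycle_edges = [e for e in edges if frozenset(e) not in tree]

    def path_to_root(v):
        path = set()
        while parent[v] is not None:
            path.add(frozenset((v, parent[v])))
            v = parent[v]
        return path

    # Tree edges lying on exactly one of the two root paths form the
    # unique tree path between the endpoints of the non-tree edge.
    cycles = {(i, j): path_to_root(i) ^ path_to_root(j) for i, j in cycle_edges}
    return tree_edges, cycle_edges, cycles


# Example: the 4-vertex graph of Fig. 10, with vertices relabeled 0..3.
edges = [(0, 1), (1, 2), (0, 2), (0, 3), (3, 2)]
tree_edges, cycle_edges, cycles = spanning_tree_and_cycles(4, edges)
print(len(tree_edges), len(cycle_edges))
```

Each non-tree edge is found in time linear in the size of the graph, which is one way to see that the procedure is efficient.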

The following lemma shows that we can sum over all labelings of $\{\delta_{ij}\}$ that satisfy C in order
to obtain the partition function.

Lemma 5.6 Let G = (V, E) be a directed graph on which a q-state difference model is defined
with the coupling functions $h_{ij}(\delta_{ij})$. Then its partition function is given by the following sum over
labelings of the delta variables, which satisfy the consistency constraints:

$$ Z_G(\beta) = q \sum_{\text{consistent labeling}} \; \prod_{(i,j)\in E} e^{-\beta h_{ij}(\delta_{ij})} . \tag{41} $$
Proof: We show that every consistent labeling of the $\{\delta_{ij}\}$ variables comes from exactly q different
colorings of the vertices of G. Given a consistent labeling of the $\{\delta_{ij}\}$ variables, pick a vertex $v_0 \in V$
and assign to it a color $\sigma_{v_0} \in \{0, 1, \ldots, q-1\}$. Then define the color of every other vertex $v \in V$
by following a path from $v_0$ to v and subtracting/adding the right $\delta_{ij}$ variables along the path.
The coloring of v is independent of the actual path, for otherwise two paths that produce different
colorings of v would create a cycle whose appropriate summation of the $\delta$ variables would not vanish.
We have therefore used the labeling of $\{\delta_{ij}\}$ to define a coloring $\sigma$ of the vertices of G. As the
coloring we have just defined depends on the initial assignment of $v_0$, there are at least q colorings
that produce $\{\delta_{ij}\}$.

These are also the only colorings that do that. Indeed, consider some coloring of the vertices.
This coloring has some assignment to the vertex $v_0$ that corresponds to one of the q colorings that
we have defined. By following the path from $v_0$ to any other vertex $v \in V$, it is easy to see that
the two colorings must agree on all vertices and not only on $v_0$. □
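The statement of Lemma 5.6 can be checked numerically on a small example. The sketch below is only a brute-force sanity check under our own assumptions: the function names are illustrative, the graph is assumed connected with vertices 0, ..., |V| - 1, every edge (i, j) is read as an arrow from j to i, and all edges share one weight function $j \mapsto e^{-\beta h(j)}$ for simplicity.

```python
import itertools
import math


def partition_function(num_vertices, edges, q, weight):
    """Direct sum over all q^|V| colorings of the vertices."""
    total = 0.0
    for sigma in itertools.product(range(q), repeat=num_vertices):
        term = 1.0
        for i, j in edges:
            term *= weight((sigma[i] - sigma[j]) % q)
        total += term
    return total


def is_consistent(edges, q, delta):
    """Check whether a labeling of the delta variables comes from a coloring.

    Colors are propagated from vertex 0 (fixed to color 0) along the edges,
    and every edge constraint is verified afterwards.
    """
    color = {0: 0}
    changed = True
    while changed:
        changed = False
        for (i, j), d in zip(edges, delta):
            if j in color and i not in color:
                color[i] = (color[j] + d) % q
                changed = True
            elif i in color and j not in color:
                color[j] = (color[i] - d) % q
                changed = True
    return all((color[i] - color[j]) % q == d for (i, j), d in zip(edges, delta))


def delta_sum(edges, q, weight):
    """q times the sum over consistent labelings, i.e. the form of Eq. (41)."""
    total = 0.0
    for delta in itertools.product(range(q), repeat=len(edges)):
        if is_consistent(edges, q, delta):
            total += math.prod(weight(d) for d in delta)
    return q * total


# Example: the graph of Fig. 10 (vertices relabeled 0..3) with q = 3.
edges = [(0, 1), (1, 2), (0, 2), (0, 3), (3, 2)]
energies = [0.0, 0.7, 1.3]                      # hypothetical values of h(j)
weight = lambda j: math.exp(-0.5 * energies[j])  # beta = 0.5, also hypothetical
print(partition_function(4, edges, 3, weight), delta_sum(edges, 3, weight))
```

The two printed numbers should agree up to floating-point rounding, and the number of consistent labelings is $q^{|V|-1}$, one for each labeling of the tree edges.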

With this result we are in a position to define a tensor network that is based on the $\delta$ variables.
The first step is to define the graph $G_\delta$ on which the network is defined.

Definition 5.7 (The graph $G_\delta$) The graph $G_\delta = (V_\delta, E_\delta)$ is constructed from a graph G = (V,E)
as follows. Pick a spanning tree for G. Denote the set of edges of the spanning tree by $E_{tr}$, and let
$E_{cycle} := E \setminus E_{tr}$. As previously discussed, the choice of a spanning tree produces a set of independent
cycles $C = \{C_e\}_{e \in E_{cycle}}$ ($C_e$ involves e and some subset of the edges of $E_{tr}$). Then we construct $G_\delta$
in 4 steps, which are also illustrated in Fig. 11:

(a) We embed G in three dimensions and identify its cycles. We will use its edges and vertices
only as a guide for the construction of $G_\delta$.

(b) We place a vertex in the middle of every edge of G; vertices that correspond to $e \in E_{tr}$ are
called tree vertices, and those that correspond to $e \in E_{cycle}$ are called cycle vertices. Above
every such vertex we create another vertex, which is called an energy vertex, and connect the
two vertices with an edge.

(c) For every cycle $C_e$, we connect the cycle vertex corresponding to e with all tree vertices
associated to edges in $E_{tr}$ involved in the cycle $C_e$.

(d) We put a vertex in the middle of every edge that connects a tree vertex to its energy vertex.
These are called mid vertices. For every cycle $C_e$ we connect the cycle vertex associated to e
with all mid vertices associated to edges in $E_{tr}$ involved in the cycle $C_e$.
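The four steps of Definition 5.7 can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function name `build_g_delta` and the encoding of vertices as (kind, edge) pairs are assumptions, and the input is taken to be a list of tree edges together with a dictionary that maps each cycle edge to the tree edges of its cycle.

```python
def build_g_delta(tree_edges, cycles):
    """Build vertex and edge lists of G_delta from a spanning-tree decomposition.

    tree_edges: iterable of hashable labels for the edges in E_tr.
    cycles: dict mapping each edge of E_cycle to the tree edges in its cycle.
    """
    vertices, edges = [], []

    # Steps (b) and (d): tree vertex -- mid vertex -- energy vertex.
    for e in tree_edges:
        vertices += [("tree", e), ("mid", e), ("energy", e)]
        edges.append((("tree", e), ("mid", e)))
        edges.append((("mid", e), ("energy", e)))

    for c, tree_part in cycles.items():
        # Step (b): cycle vertex with its own energy vertex.
        vertices += [("cycle", c), ("energy", c)]
        edges.append((("cycle", c), ("energy", c)))
        for e in tree_part:
            edges.append((("cycle", c), ("tree", e)))  # step (c)
            edges.append((("cycle", c), ("mid", e)))   # step (d)

    return vertices, edges


# Example with three tree edges and two cycles (labels are hypothetical).
vertices, edges = build_g_delta(["a", "b", "c"], {"x": ["a", "b"], "y": ["b", "c"]})
print(len(vertices), len(edges))
```

In this encoding a cycle vertex whose cycle contains n tree edges has degree 2n + 1: n edges to tree vertices, n edges to mid vertices, and one edge to its energy vertex.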

We now describe a tensor network on $G_\delta$ that will evaluate the partition function. The network
has the following tensors:

• Tree vertices and mid vertices. These are identity tensors: they are zero unless all the colors
of their edges are equal, in which case they are 1.

Figure 11: Constructing $G_\delta = (V_\delta, E_\delta)$ from G = (V, E). (a) We begin with a connected graph
G = (V, E) with a spanning tree that is denoted by the solid edges. G contains two independent
cycles, denoted by arrows. (b) We place a tree vertex in the middle of every tree edge of G and a
cycle vertex in the middle of every cycle edge (black filled vertices). We connect them to energy
vertices (unfilled vertices), which are placed above. (c) We connect the cycle vertices to the tree
vertices in their cycle. (d) We place a mid vertex (unfilled square) in the middle of every edge that
connects a tree vertex to its energy vertex. Finally we connect the cycle vertices to the mid vertices
in their cycle. The arrows on edges denote the bubbling order of $G_\delta$.

• Energy vertices. These vertices have only one edge, which is associated with a $\delta$ variable.
Their definition is $M_e(\delta) = e^{-\beta h_e(\delta)}$, with $h_e(\cdot)$ being the corresponding energy function of
the edge in the parent graph G.

• Cycle vertices. Recall that the edges of a cycle vertex come in pairs that correspond to the
tree vertices in that cycle: one edge connects the cycle vertex to the tree vertex and the other
connects the cycle vertex to the associated mid vertex.

The tensor shall be zero unless the labels of each pair are equal. When they are equal, we
interpret the label of each pair to be the $\delta$ value of the underlying tree edge (of the original
graph). In addition, we interpret the label of the edge that connects to the energy vertex as the
$\delta$ variable of the underlying cycle edge. When all these labelings satisfy the consistency equation
of the cycle, the tensor is 1. Otherwise it is zero.

The following lemma shows that the above tensor network evaluates the partition function. 

Lemma 5.8 The tensor-network that was defined above for the graph $G_\delta = (V_\delta, E_\delta)$ evaluates
$q^{-1} Z_G(\beta)$, with G = (V,E) being the original graph from which $G_\delta$ was constructed.

Proof: According to Lemma 5.6, it is enough to show that the tensor network gives

$$ \sum_{\text{consistent labeling}} \; \prod_{(i,j)\in E} e^{-\beta h_{ij}(\delta_{ij})} , \tag{42} $$

where the sum is over all labelings of the delta variables $\{\delta_{ij}\}$ that satisfy the consistency constraints.

The tensors of the tree vertices are identity tensors. Therefore in a non-vanishing labeling of
$G_\delta$, all edges that connect to a tree vertex must have the same labeling. This uniquely defines a
labeling of the tree vertices. The converse is also true: any labeling of the tree vertices uniquely
defines a non-vanishing labeling of $G_\delta$. Indeed, given a labeling of the tree vertices, we first label
all their incident edges. This way every mid vertex has exactly one labeled edge, which determines
the labeling of the rest of the edges. The only edges which are left unlabeled are the edges that
connect the cycle vertices to their energy vertices. Their labeling is uniquely determined by the
cycle constraint as manifested by the tensors of the cycle vertices.

Now every consistent labeling of the $\{\delta_{ij}\}$ variables in G is uniquely determined by a labeling
$\{\delta_{ij}\}$ of the tree edges, which is equivalent to a labeling of the tree vertices. Therefore every
consistent labeling of $\{\delta_{ij}\}$ corresponds to a non-vanishing labeling of $G_\delta$ and vice versa. We leave it
to the reader to verify that for these non-zero terms, the value of the network is exactly
$\prod_{(i,j)\in E} e^{-\beta h_{ij}(\delta_{ij})}$. □



Let us now analyze the approximation scale of this tensor network in a simple bubbling that is
described in Fig. 11(d). We place the 4 types of vertices on 4 different horizontal planes: all the
tree vertices are put on the lowest plane. In the plane above we place the cycle vertices, followed by
the mid vertices and finally the energy vertices. By the definition of $G_\delta$, all edges are interplanar.
Therefore by bubbling the graph from bottom to top, we have 4 types of norms:

• Tree vertices: the bubbling here is $0 \to n$. The tensor is an identity tensor and therefore the
norm is $q^{1/2}$.

• Cycle vertices: the bubbling here is $n \to n + 1$, where the first n connect to the tree vertices
and the second n + 1 connect to the corresponding mid vertices as well as to the energy vertex.
The labeling of the input edges uniquely determines the labeling of the output edges with a
weight of unity. Therefore the norm is 1.

• Mid vertices: the bubbling here is $n \to 1$ for these identity tensors, which yields a norm of 1.

• Energy vertices: the bubbling here is $1 \to 0$. It is easy to see that the norm here is
$\left( \sum_{j=0}^{q-1} |e^{-\beta h_e(j)}|^2 \right)^{1/2}$.
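The four norms listed above can be verified numerically for small q and n. The following NumPy sketch is only a sanity check under our own assumptions: the helper names are illustrative, the cycle constraint is taken with all signs positive (so the cycle label is minus the sum of the tree labels modulo q), and the energy weights are arbitrary sample values.

```python
import itertools

import numpy as np


def identity_tensor_matrix(q, n_in, n_out):
    """Identity tensor read as a linear map from n_in to n_out q-level legs."""
    matrix = np.zeros((q ** n_out, q ** n_in))
    for j in range(q):
        row = sum(j * q ** k for k in range(n_out))  # index of |j, ..., j>
        col = sum(j * q ** k for k in range(n_in))
        matrix[row, col] = 1.0
    return matrix


def cycle_tensor_matrix(q, n):
    """Cycle vertex as a map n -> n + 1: copy the inputs and append the
    cycle label fixed by the (all-plus, assumed) consistency constraint."""
    matrix = np.zeros((q ** (n + 1), q ** n))
    for col, deltas in enumerate(itertools.product(range(q), repeat=n)):
        delta_cycle = (-sum(deltas)) % q
        row = 0
        for d in deltas + (delta_cycle,):
            row = row * q + d
        matrix[row, col] = 1.0
    return matrix


def spectral_norm(matrix):
    """Operator norm, i.e. the largest singular value."""
    return np.linalg.norm(matrix, 2)


q, n = 3, 2
weights = np.array([[1.0, 0.7, 0.5]])  # sample values of e^{-beta h_e(j)}, as a 1 -> 0 map

print(spectral_norm(identity_tensor_matrix(q, 0, n)))  # tree vertex, 0 -> n
print(spectral_norm(cycle_tensor_matrix(q, n)))        # cycle vertex, n -> n + 1
print(spectral_norm(identity_tensor_matrix(q, n, 1)))  # mid vertex, n -> 1
print(spectral_norm(weights))                          # energy vertex, 1 -> 0
```

The tree-vertex value should equal $q^{1/2}$, the cycle and mid vertices should give 1, and the energy vertex should give the Euclidean norm of the weight vector.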



Multiplying these norms together and using the fact that there are exactly |V| - 1 tree vertices, we
arrive at the following corollary:

Corollary 5.9 The above tensor-network and bubbling yield an additive approximation to a q-state
difference model with the approximation scale

$$ \Delta = q^{(|V|+1)/2} \prod_{(i,j)\in E} \left( \sum_{\delta_{ij}=0}^{q-1} \left| e^{-\beta h_{ij}(\delta_{ij})} \right|^2 \right)^{1/2} . \tag{43} $$

Notice that in the formula above we have a factor $q^{(|V|+1)/2}$ instead of $q^{(|V|-1)/2}$ because our tensor
network evaluates $q^{-1} Z_G$ and therefore we must multiply its approximation scale by a factor of q.

In recent work, Van den Nest et al. [VdNDB07b, VdNDB07a] revealed an interesting link between
the partition function of classical statistical models and quantum physics. They show how to
express the q-state partition functions of the difference models as inner products between certain
graph states and a simple tensor product state. Since both states can be efficiently generated
by a quantum circuit, their paper contained an implicit quantum algorithm to approximate these
partition functions [VdN07]; analysis of the additive error produces the identical scale to Eq. (43).
Even more recently, during the time that the work presented here was being refereed, Van den
Nest used this connection in an interesting work [VdN09a] to show that this scale is classically
achievable. However, the approximation scale in Eq. (43) can be improved to a scale for which no
classical simulation result is known. The improvement that we describe now serves as an example
of the flexibility that the tensor-network formalism offers.

We start by combining the energy vertices of the cycle vertices into the cycle vertices. Then
the new tensor is non-vanishing if and only if the original tensor is non-vanishing, only that now
the non-vanishing configurations are given the appropriate energy weight. The bubbling of this
new vertex is in an $n \to n$ fashion, and its norm is $\max_j |e^{-\beta h_e(j)}|$. This is strictly smaller than the
combined contribution of the original two vertices, which is $\left( \sum_{j=0}^{q-1} |e^{-\beta h_e(j)}|^2 \right)^{1/2}$.

The second modification is to redistribute the weight of the energy vertices that correspond to
the tree vertices. We split it equally between the energy vertex and its associated tree vertex: when
these new vertices have all their edges labeled by the same j, their weight is $\sqrt{e^{-\beta h_e(j)}}$. Under
the same bubbling as before, the contribution of the two vertices becomes $\sum_{j=0}^{q-1} |e^{-\beta h_e(j)}|$, which
is smaller than or equal to the previous contribution $q^{1/2} \left( \sum_{j=0}^{q-1} |e^{-\beta h_e(j)}|^2 \right)^{1/2}$. All in all, the new
approximation scale yields the following result:

Corollary 5.10 (An efficient quantum algorithm for the difference q-state model) Given
a difference q-state model on a graph G = (V, E), and a spanning tree $E_{tr} \subset E$, with $E_{cycle} := E \setminus E_{tr}$,
there exists an efficient quantum algorithm that provides an additive approximation of the partition
function $Z_G(\beta)$ with the approximation scale

$$ \Delta = q \left( \prod_{e \in E_{tr}} \sum_{j=0}^{q-1} \left| e^{-\beta h_e(j)} \right| \right) \cdot \left( \prod_{e \in E_{cycle}} \max_j \left| e^{-\beta h_e(j)} \right| \right) . \tag{44} $$

Notice that unlike the previous approximation scale, this scale depends on the spanning tree.
It is always smaller than or equal to the scale in Eq. (43). We do not know of a complementary
hardness result, hence we cannot generally assess the quality of the approximation. However, the
classical simulation results of [VdN09a] that apply to the approximation scale described in Eq. (43)
cannot readily be applied to achieve the approximation scale in Corollary 5.10 [VdN09b].
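To make the comparison between the two scales concrete, the sketch below evaluates both for given edge weights. It rests on our reading of the two scales: the first is $q^{(|V|+1)/2}$ times the product over all edges of the Euclidean norms of the weight vectors, the second is q times the product of the weight sums over tree edges and of the weight maxima over cycle edges. The function names and the sample energies are illustrative assumptions.

```python
import math


def scale_all_edges(num_vertices, weights, q):
    """Scale of the first construction (our reading of Eq. (43)).

    weights: one list [e^{-beta h_e(j)} for j in range(q)] per edge of G.
    """
    scale = q ** ((num_vertices + 1) / 2)
    for w in weights:
        scale *= math.sqrt(sum(abs(x) ** 2 for x in w))
    return scale


def scale_tree_cycle(tree_weights, cycle_weights, q):
    """Scale of the improved construction (our reading of Eq. (44))."""
    scale = float(q)
    for w in tree_weights:
        scale *= sum(abs(x) for x in w)
    for w in cycle_weights:
        scale *= max(abs(x) for x in w)
    return scale


# Example: |V| = 4, three tree edges, two cycle edges, identical couplings.
q = 3
w = [math.exp(-0.5 * h) for h in (0.0, 0.7, 1.3)]  # hypothetical energies, beta = 0.5
tree_weights, cycle_weights = [w] * 3, [w] * 2
print(scale_all_edges(4, tree_weights + cycle_weights, q))
print(scale_tree_cycle(tree_weights, cycle_weights, q))
```

The second value never exceeds the first: by the Cauchy-Schwarz inequality each tree edge satisfies $\sum_j |w_j| \le q^{1/2} (\sum_j |w_j|^2)^{1/2}$, each cycle edge satisfies $\max_j |w_j| \le (\sum_j |w_j|^2)^{1/2}$, and the |V| - 1 factors of $q^{1/2}$ together with the prefactor q give $q^{(|V|+1)/2}$.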

Just as in the general case, we are not limited to physical energies, and the functions $h_e(\delta)$ and
$\beta$ can be complex. Similarly, we have not restricted the shape of the original graph G because
the resulting high-degree vertices in $G_\delta$ are always reducible: they are either associated with the
identity tensors, which, as explained in the first remark on page 21, are reducible, or they are cycle
vertices, which are also reducible by a similar argument.

It is natural to wonder whether the approximation scale given by Corollary 5.10 yields better
approximations than specializing the more general Corollary 5.4 to difference models. This will
depend on the specific model and parameters being considered. For every choice of parameters we
have the following inequality:



max e 



j=0 



Since there exist parameters to achieve both extremes in this inequality, and there are graphs for
which $\delta = 2$ in Corollary 5.4, it follows that there exist certain models for which the approximation
scale in Corollary 5.10 is better and others for which Corollary 5.4 is better.

We conclude this section by applying Corollary 5.10 to the special case of the q-state Potts
model with a homogeneous coupling J. In that case, $e^{-\beta h_e(j)} = e^{\beta J}$ for j = 0, and $e^{-\beta h_e(j)} = 1$
for 0 < j < q. Therefore the spanning tree dependence disappears, and we obtain

Corollary 5.11 There exists an efficient quantum algorithm that gives an additive approximation
of the partition function $Z_G(\beta)$ of the homogeneous q-state Potts model that is defined on
an arbitrary graph G = (V, E) with inverse temperature $\beta > 0$ and a coupling constant J. The
approximation scale is given by

$$ \Delta = \begin{cases} q \, (q - 1 + e^{\beta J})^{|V|-1} \, (e^{\beta J})^{|E|-|V|+1} , & \text{Ferromagnetic case } (J > 0) \\ q \, (q - 1 + e^{\beta J})^{|V|-1} , & \text{Antiferromagnetic case } (J < 0) \end{cases} \tag{45} $$

It is interesting to compare the above results to a classical result that is given in Proposition
5.2 of [BFLW05]. There, the authors assert that there exists a straightforward classical sampling
algorithm that provides an additive approximation for the Tutte polynomial $T_G(x, y)$ of a connected
graph G for x > 1, y > 1 with an approximation scale $y^{|E|} (x - 1)^{|V|-1}$. However, for such graphs,
$T_G(x,y) = (x - 1)^{-1} (y - 1)^{-|V|} Z_G(\beta)$, where $Z_G(\beta)$ is the partition function of the homogeneous
Potts model with q = (x - 1)(y - 1) and $y = e^{\beta J}$ [JVW90, Sok05]. Therefore their classical
algorithm provides an additive approximation for the ferromagnetic case with an approximation
scale $\Delta' = q^{|V|} (e^{\beta J})^{|E|}$, and we obtain

$$ \frac{\Delta}{\Delta'} = \left( \frac{q - 1 + e^{\beta J}}{q \, e^{\beta J}} \right)^{|V|-1} . $$

This ratio is exponentially small as long as q > 1 and $\beta J > 0$, and thus it may be seen as an
indication for the non-triviality of our approximation. On the other hand, this classical result is in
many cases better than the quantum results of Corollary 5.4 and Corollary 5.11 for the homogeneous
ferromagnetic Potts case, which questions their non-triviality in the other cases, and emphasizes
the crucial role of the bubbling.
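For concreteness, the quantum and classical scales for the homogeneous Potts model can be compared numerically. The sketch below works with logarithms to avoid overflow on large graphs. It assumes a connected graph and uses the closed forms as we read them: the quantum scale $q (q-1+e^{\beta J})^{|V|-1}$, multiplied by $(e^{\beta J})^{|E|-|V|+1}$ in the ferromagnetic case, and the classical scale $q^{|V|} (e^{\beta J})^{|E|}$. The function names and the sample parameters are illustrative.

```python
import math


def log_quantum_scale(num_vertices, num_edges, q, beta_j):
    """Logarithm of the Potts scale of Corollary 5.11 (beta_j stands for beta * J)."""
    log_scale = math.log(q) + (num_vertices - 1) * math.log(q - 1 + math.exp(beta_j))
    if beta_j > 0:
        # Ferromagnetic case: every one of the |E| - |V| + 1 cycle edges
        # contributes max_j e^{-beta h(j)} = e^{beta J}.
        log_scale += (num_edges - num_vertices + 1) * beta_j
    return log_scale


def log_classical_scale(num_vertices, num_edges, q, beta_j):
    """Logarithm of the classical scale q^{|V|} e^{beta J |E|} (ferromagnetic case)."""
    return num_vertices * math.log(q) + num_edges * beta_j


def log_ratio(num_vertices, q, beta_j):
    """Logarithm of the closed-form ratio of the quantum to the classical scale."""
    return (num_vertices - 1) * math.log(
        (q - 1 + math.exp(beta_j)) / (q * math.exp(beta_j))
    )


# Example with hypothetical parameters: |V| = 50, |E| = 120, q = 3, beta * J = 0.8.
difference = log_quantum_scale(50, 120, 3, 0.8) - log_classical_scale(50, 120, 3, 0.8)
print(difference, log_ratio(50, 3, 0.8))
```

Both printed values should coincide and be negative, and their magnitude grows linearly with |V|, which is the exponential decay of the ratio noted above.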

Finally, [BFLW05] also presents a simple additive approximation for the chromatic polynomial
$P_G(q)$ that counts the number of legal q-colorings of the graph G. They show that there exists a
classical additive approximation for $P_G(q)$ for integer q, whose approximation scale is $(q - 1)^{|V|}$.
Moreover, this scale is tight, in the sense that for any $0 < \delta < q - 1$, an additive approximation
with scale $(q - 1 - \delta)^{|V|}$ is NP-hard.

It is easy to verify that the chromatic polynomial is obtained from the antiferromagnetic partition
function of the homogeneous Potts model for $e^{\beta J} \to 0$ (i.e., J < 0 and $\beta \to +\infty$). Not
surprisingly, in such a case the approximation scale of Corollary 5.11 is equivalent to $(q - 1)^{|V|}$, the
classical result.

6 Tensor networks and the Jones and Tutte polynomials.



Recently, efficient quantum algorithms have been given for additively approximating certain
topological/combinatorial quantities: the Jones polynomial of braids at roots of unity [FLW02, FKW02,
FKLW02, AJL06] and the Tutte polynomial of planar graphs [AAEL07]. Broadly speaking, both
results can be viewed in three steps:

1. The problem is mapped into a combinatorial calculation within the Temperley-Lieb algebra.

2. Representation theory of the Temperley-Lieb algebra is used to translate the combinatorial
problem into a linear-algebra problem.

3. A quantum algorithm is given for approximating the solution to the linear-algebra problem. 

This final step can be seen as the approximation of a particular tensor network. Without going 
into the details, the rough description of the tensor network for the two problems is as follows: 

• The Jones Polynomial of a Braid. Here the tensor network is derived from the braid 
by closing up the loose strands of the braid and then replacing every crossing by a vertex 
corresponding to a rank 4 tensor, and inserting a vertex at any local maximum or minimum 
of the strands. 

• The Tutte Polynomial of a planar Graph. Here, the original graph G is replaced by 
a so called medial graph, which features a rank-four tensor at the center of every edge of G. 
Again, as in the Jones Polynomial case, rank two tensors are inserted at any local maximum 
or minimum. 

Even with this rough description, an intuitive understanding of the nature of the errors given
in these works can be obtained. In the Jones Polynomial case, the parameter being a root of
unity ensures that the rank 4 tensors can be swallowed $2 \to 2$ such that the swallowing is a
unitary operator and hence does not affect the scale $\Delta$. What remains is the cost of swallowing the
maximum and minimum tensors in a $0 \to 2$ or $2 \to 0$ fashion, each of which contributes a factor of
yjq. In the Tutte Polynomial case, the contributions to $\Delta$ include the previous cost of the rank 2
tensor swallowing, but in addition, unlike the Jones Polynomial case, also include a cost for each
$2 \to 2$ swallowing of the rank 4 tensors (in the language of this paper, these quantities are the
$\|\rho(T_i)\|$ terms). This is because in [AAEL07], the crossing operators are not necessarily unitary.

We have previously discussed the need to carefully examine the nature of the additive error that
Theorem 3.4 provides. In the context of these problems, the non-trivial nature of the approximation
has been established by showing that, for certain sets of parameters, achieving the level of approximation
provided by both algorithms is a complete problem for quantum computation

|FLwn2i lYWnHl lAAnoi IAAELn7j . 

7 Conclusions and open questions 

We have given a quantum algorithm that additively approximates the value of a tensor network to 
a certain scale. As an application of the algorithm, we have obtained new quantum algorithms that 
approximate the partition functions of certain statistical mechanical models including the Potts 
model.

The fact that the approximation is additive and depends on the approximation scale is by no
means a minor point: for a given algorithm, with large enough approximation error, the approxima- 
tion is useless and the algorithms are trivial, or at least can be matched classically. We have shown 
that in some cases, the approximation scale of the algorithm is BQP-hard, and therefore, some 
instances of the problem are highly nontrivial. We consider this to be an important but indirect 
verification that the approximation scale is non-trivial. What is missing is a direct argument: an 
argument that would say the approximation scale is good enough to answer a question directly con- 
nected to the quantities being estimated (i.e. topological invariants, statistical mechanical models). 
Such an argument would represent a significant advance. 

Our intuition is that the tensor network point of view should be helpful for the design of new 
quantum algorithms in the future. This is motivated by the fact that from the tensor network 
viewpoint two core features of quantum circuits, the unitarity of the gates, and the notion of time 
(i.e. that the gates have to be applied in a particular sequence), are replaced by more flexible 
features. The unitary gate is replaced by an arbitrary linear map encoded in each tensor and the 
notion of time is replaced by the geometry of the underlying graph of the tensor network along 
with a choice of bubbling of the network. Hence, the design of algorithms from the tensor-network 
point of view requires two things: a tensor network whose value is the quantity of interest, and a 
specification of a bubbling order of the vertices of the underlying graph. It often seems that for a 
specific problem there are several somewhat natural tensor networks with the right value to choose 
from and that the approximation scale can vary quite dramatically between the choices (as was 
the case in the difference statistical mechanical models). Additionally, for a given tensor network, 
the choice of ordering can make a significant difference as well. The analysis of these issues has a
combinatorial and graph theoretic flavor. It would be interesting to understand the computational
complexity of finding an optimal or even a reasonably good choice of bubbling for a given tensor 
network. 

It is also interesting to see if the tensor-network framework can help in understanding other
models of quantum computation. For example, in the one clean qubit model of quantum computation,
we have a universal quantum computer that is allowed to operate only on one clean qubit
(initialized to $|0\rangle$), while the rest of the qubits are in a completely mixed state $\rho \propto \mathbb{1}$ [KL98]. The
result of such a computation can be seen as the trace of a product of 4 operators: a quantum
circuit U times a local projection Q, times $U^\dagger$ and another local projection P. In the tensor-network
setting, the trace of these 4 operators will look like edges connecting one part of the chain to its
other part. It is therefore not surprising that the estimation of the Jones polynomial of the trace
closure of a braid is known to be in this model (in fact, recently Shor & Jordan have shown that it
is complete for this model [SJ08]), because the trace closure of a braid, when interpreted as a tensor
network, translates to the (weighted) trace operation in a quantum computation. However, as the 
same tensor-network can be graphically presented and bubbled in many different ways, it might 
be hard to identify the trace operation that hides in a particular layout of the network. It would 
therefore be interesting to see if there exists a more natural way to characterize these networks, 
using, perhaps, some property of the network that is invariant to the way in which it is presented. 
Such characterization might shed some more light on this interesting complexity class. 

It is also interesting to consider the idea of additive approximations from a complexity theory 
point of view. In a recent paper, Goldberg & Jerrum studied the complexity of a multiplicative 
approximation (FPRAS) for the Tutte polynomial [GJ07]. They map about three quarters of the
Tutte plane, distinguishing between points where there is an FPRAS and points where an FPRAS
is NP-hard. It would be interesting to do the same with respect to additive approximations. In
light of the recent quantum algorithms for the Jones and Tutte polynomials, as well as the results of
this paper, it seems that additive approximations are a natural framework for quantum algorithms.
We therefore hope that quantum hardness results can be used to map regions in the Tutte plane 
which are inaccessible to classical additive approximations with certain approximation scales (unless 
BPP = BQP). It is also interesting to understand the relationship between such points and other 
points and approximation scales where an additive approximation is NP-hard. The first few steps 
in that direction were taken by [BFLW05], and we hope that the algorithms and techniques of this
paper can be used to further advance these ideas. 

Finally, we briefly mention two other directions of inquiry that might be of interest. The first 
is to see whether there is a natural extension of our tensor-network definition of the BQP class to
the QMA class (or more likely the QCMA class). Can such a definition shed new light on these 
complexity classes? A related problem is to find a QMA-complete problem that is naturally cast in 
the language of tensor-networks. 

The second direction is to understand the structure of universal sets of tensors, i.e. sets of 
elementary tensors that can be efficiently contracted to approximate any other tensor. So far, 
such sets were found solely using techniques from quantum computation: one begins with a set of
transformations that form a dense subgroup of SU(N) or SL(N), and then proves universality using
the (either unitary or non-unitary, see [AAEL07]) Solovay-Kitaev theorem. The set of universal
transformations yields a set of universal tensors. It is therefore interesting to see if there exist
other, perhaps more direct, techniques to prove such universality, techniques that do not rely on
heavy machinery from the theory of Lie groups.

A The Hadamard test

The Hadamard test is a simple and well-known quantum algorithm that approximates the inner
product $\langle\alpha|U|\alpha\rangle$ for a normalized state $|\alpha\rangle$ and a unitary operator U that can be efficiently
generated. As it is an important part of our main result, Theorem 3.4, we include it here for the sake of
completeness.

Theorem A.1 (The Hadamard test) Let $|\alpha\rangle$ be a normalized state that can be efficiently
generated (e.g., a tensor product $|0\rangle^{\otimes n}$), and let U be a unitary operator that can be implemented on a
quantum computer in time T. Then there exists a quantum algorithm that for every $\epsilon > 0$ outputs
a complex number r such that

$$ \Pr\left( \left| \langle\alpha|U|\alpha\rangle - r \right| > \epsilon \right) < 1/4 , \tag{47} $$

and the running time of the algorithm is $O(\epsilon^{-2} T)$.

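Proof: We add an ancillary qubit to the system and initialize it in the state $|\psi_0\rangle = |\alpha\rangle \otimes |0\rangle$. Acting
with the Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ on the ancillary qubit, we get $|\psi_1\rangle = \frac{1}{\sqrt{2}} |\alpha\rangle \otimes (|0\rangle + |1\rangle)$.
The next step is to act with U on the registers of $|\alpha\rangle$ conditioned on the ancillary qubit. The result
is $|\psi_2\rangle = \frac{1}{\sqrt{2}} \left( |\alpha\rangle \otimes |0\rangle + (U|\alpha\rangle) \otimes |1\rangle \right)$. Finally we act again with the Hadamard gate on the ancillary
qubit and obtain

$$ |\psi_3\rangle = \frac{1}{2} \left[ |\alpha\rangle \otimes |0\rangle + |\alpha\rangle \otimes |1\rangle + (U|\alpha\rangle) \otimes |1\rangle - (U|\alpha\rangle) \otimes |0\rangle \right] . \tag{48} $$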






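We measure the ancillary qubit and output the number 1 for $|1\rangle$ and -1 for $|0\rangle$. We repeat
this process N times and store the results in the variables $x_1, \ldots, x_N$. These are independent
identically distributed random variables with an average $E(x_i) = \mathrm{Re}\langle\alpha|U|\alpha\rangle$ since $\Pr(x_i = 1) =
\frac{1}{4}[2 + 2\mathrm{Re}\langle\alpha|U|\alpha\rangle]$ and $\Pr(x_i = -1) = \frac{1}{4}[2 - 2\mathrm{Re}\langle\alpha|U|\alpha\rangle]$. We can therefore use the
Chernoff-Hoeffding bound and obtain

$$ \Pr\left( \left| \frac{1}{N} \sum_{i=1}^{N} x_i - \mathrm{Re}\langle\alpha|U|\alpha\rangle \right| \ge \epsilon \right) \le 2 e^{-N\epsilon^2/2} . \tag{49} $$

Thus taking $N = \Theta(\epsilon^{-2})$ we obtain the right approximation for $\mathrm{Re}\langle\alpha|U|\alpha\rangle$.
To approximate the imaginary part, we change the first step such that $|\psi_1\rangle = \frac{1}{\sqrt{2}} |\alpha\rangle \otimes (|0\rangle - i|1\rangle)$,
and proceed in the same way. All in all, the entire algorithm runs in $O(T\epsilon^{-2})$ quantum time.
Notice that we can replace the 1/4 factor in (47) by any constant $\delta > 0$ and obtain a running
time of $O(T\epsilon^{-2} \log(1/\delta))$.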
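The sampling procedure of the proof can be simulated classically for small systems. The following NumPy sketch is an illustration only: it uses the standard Hadamard matrix and outputs +1 on the ancilla outcome $|0\rangle$, a labeling that is an assumption of the sketch and may differ from the convention in the proof by a relabeling of the outcomes; the function name `hadamard_test` and the sample inputs are also our own.

```python
import numpy as np


def hadamard_test(alpha, unitary, num_samples, rng):
    """Estimate <alpha|U|alpha> from simulated ancilla measurements.

    The ancilla is the first tensor factor. Each run yields +1 (outcome 0)
    or -1 (outcome 1); the sample mean estimates the real or imaginary part
    depending on the initial ancilla state.
    """
    dim = len(alpha)
    hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    identity = np.eye(dim)
    zeros = np.zeros((dim, dim))
    controlled_u = np.block([[identity, zeros], [zeros, unitary]])
    hadamard_on_ancilla = np.kron(hadamard, identity)

    def estimate(ancilla):
        psi = hadamard_on_ancilla @ (controlled_u @ np.kron(ancilla, alpha))
        # Probability of measuring the ancilla in |0>, clipped against rounding.
        p0 = float(np.clip(np.linalg.norm(psi[:dim]) ** 2, 0.0, 1.0))
        num_plus = rng.binomial(num_samples, p0)
        return (2 * num_plus - num_samples) / num_samples

    real_part = estimate(np.array([1, 1]) / np.sqrt(2))    # (|0> + |1>)/sqrt(2)
    imag_part = estimate(np.array([1, -1j]) / np.sqrt(2))  # (|0> - i|1>)/sqrt(2)
    return real_part + 1j * imag_part


# Example: a single-qubit phase gate, for which <alpha|U|alpha> = e^{0.3 i}.
rng = np.random.default_rng(0)
alpha = np.array([1.0, 0.0])
unitary = np.diag([np.exp(0.3j), 1.0])
print(hadamard_test(alpha, unitary, 20000, rng), np.exp(0.3j))
```

The statistical error of each part decreases as $N^{-1/2}$, in line with the choice $N = \Theta(\epsilon^{-2})$.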